Authors: Dan Chen, Chuangyi Gui, Yi Zhang, Hai Jin, Long Zheng, Yu Huang, and Xiaofei Liao (Huazhong University of Science and Technology (HUST))
Abstract: Existing streaming graph processing systems typically adopt two phases of refinement and recomputation to ensure the correctness of the incremental computation. However, severe redundant memory accesses exist due to the unnecessary synchronization among independent edge updates. In this paper, we present GraphFly, a high-performance asynchronous streaming graph processing system based on dependency-flows. GraphFly features three key designs: 1) Dependency trees (Dtrees), which helps quickly identify independent graph updates with low cost; 2) Dependency-flow based processing model, which exploits the space-time dependent co-scheduling for cache efficiency; 3) Specialized graph data layout, which further reduces memory accesses. We evaluate GraphFly, and the results show that GraphFly significantly outperforms state-of-the-art systems KickStarter and GraphBolt by 5.81× and 1.78× on average, respectively. Also, GraphFly scales well with different sizes of update batch and compute resources.
Back to Technical Papers Archive Listing