GraphFly: Efficient Asynchronous Streaming Graphs Processing via Dependency-Flow
DescriptionExisting streaming graph processing systems typically adopt two phases of refinement and recomputation to ensure the correctness of the incremental computation. However, severe redundant memory accesses exist due to the unnecessary synchronization among independent edge updates. In this paper, we present GraphFly, a high-performance asynchronous streaming graph processing system based on dependency-flows. GraphFly features three key designs: 1) Dependency trees (Dtrees), which helps quickly identify independent graph updates with low cost; 2) Dependency-flow based processing model, which exploits the space-time dependent co-scheduling for cache efficiency; 3) Specialized graph data layout, which further reduces memory accesses. We evaluate GraphFly, and the results show that GraphFly significantly outperforms state-of-the-art systems KickStarter and GraphBolt by 5.81× and 1.78× on average, respectively. Also, GraphFly scales well with different sizes of update batch and compute resources.
Event Type
TimeWednesday, 16 November 20222:30pm - 3pm CST
Registration Categories
Big Data
Computational Science
Reproducibility Badges
Session Formats
Back To Top Button