Authors: Menghan Jia (National University of Defense Technology (NUDT), China); Yiming Zhang (Xiamen University; National University of Defense Technology (NUDT), China); Xinbiao Gan and Dongsheng Li (National University of Defense Technology (NUDT), China); Erci Xu (Xiamen University; National University of Defense Technology (NUDT), China); and Ruibo Wang and Kai Lu (National University of Defense Technology (NUDT), China)
Abstract: To lower monetary and energy costs, single-machine multicore graph processing is gaining increasing attention for a wide range of traversal-centric graph algorithms such as BFS, SSSP, CC, and PageRank, whose processing is relatively simple and whose memory footprint is dominated by topology data (vertices and edges). This paper presents vGRAPH, a NUMA-aware, memory-efficient multicore graph processing system for traversal-centric algorithms. vGRAPH introduces an ultralight NUMA-aware graph preprocessing scheme that eliminates almost all complex preprocessing steps and pipelines per-NUMA graph loading and compression, reducing inter-NUMA memory accesses while keeping both preprocessing cost and peak memory footprint low. We further optimize vGRAPH with HPC techniques including prefetching and work-stealing. Evaluation on a four-NUMA machine with 384 GB of memory shows that, compared to state-of-the-art NUMA-aware and NUMA-unaware systems, vGRAPH can process much larger real-world and synthetic graphs with various traversal-centric algorithms, achieving significantly higher memory efficiency and lower processing time.