Authors: Pengcheng Li (TikTok Inc); Yixin Guo, Yingwei Luo, and Xiaolin Wang (Peking University); Zhenlin Wang (Michigan Technological University); and Xu Liu (North Carolina State University)
Abstract: Production software in data centers often suffers from unnecessary memory inefficiencies. However, whole-program monitoring tools often incur very high overhead because they instrument memory accesses at fine granularity.
To this end, this work presents Puffin, a novel learning-aided system that identifies three kinds of unnecessary memory operations: dead stores, silent loads, and silent stores. Puffin applies gated graph neural networks to fused static and dynamic program semantics with relative positional embedding. To deploy the system in large-scale data centers, this work explores a sampling-based detection infrastructure with high efficacy and negligible overhead. We evaluate Puffin on the well-known SPEC CPU 2017 benchmark suite under four compilation options. Experimental results show that the proposed method captures the three kinds of memory inefficiencies with accuracy as high as 96% and achieves a 5.66x speed-up over the state-of-the-art tool.
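For readers unfamiliar with the three inefficiency kinds, the sketch below illustrates them on a toy memory trace. It is not Puffin's method (which is learning-based and sampling-driven); it is a brute-force classifier under definitions commonly used in prior profiling work, which the abstract itself does not spell out: a dead store is overwritten before being read, a silent store writes the value already in memory, and a silent load reads the same value as the previous load from that address. The trace format and the name `classify` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical trace entry: (op, address, value), with op "load" or "store".
Access = Tuple[str, int, int]

def classify(trace: List[Access]) -> Dict[str, List[int]]:
    """Return trace indices of dead stores, silent stores, and silent loads."""
    memory: Dict[int, int] = {}                   # address -> last observed value
    last_op: Dict[int, Tuple[str, int]] = {}      # address -> (op, trace index)
    last_loaded: Dict[int, int] = {}              # address -> value of previous load
    found = {"dead_store": [], "silent_store": [], "silent_load": []}

    for i, (op, addr, value) in enumerate(trace):
        prev = last_op.get(addr)
        if op == "store":
            # Previous store was never read before this overwrite: it is dead.
            if prev is not None and prev[0] == "store":
                found["dead_store"].append(prev[1])
            # Writing the value that is already there: silent store.
            if memory.get(addr) == value:
                found["silent_store"].append(i)
        else:
            # Same value as the previous load from this address: silent load.
            if addr in last_loaded and last_loaded[addr] == value:
                found["silent_load"].append(i)
            last_loaded[addr] = value
        memory[addr] = value
        last_op[addr] = (op, i)
    return found

trace = [
    ("store", 0x10, 1),  # overwritten at index 1 without being read
    ("store", 0x10, 2),
    ("load",  0x10, 2),
    ("load",  0x10, 2),  # repeats the previous load's value
    ("store", 0x10, 2),  # writes the value already in memory
]
result = classify(trace)
```

Checking every access this way is the kind of fine-grained instrumentation whose overhead the abstract describes as the motivation for a sampling-based approach.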