Workshop: PMBS22: The 13th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems
Authors: Taylor Groves, Chris Daley, Rahul Gayatri, Nan Ding, Lenny Oliker, Hai Ah Nam, Nicholas J. Wright, and Sam Williams (Lawrence Berkeley National Laboratory (LBNL))
Abstract: Tighter integration of computational resources can foster superior application performance by mitigating communication bottlenecks. Unfortunately, not every application can use every compute or accelerator all the time. As a result, co-locating resources often leads to under-utilization of resources. In the next five years, HPC system architects will be presented with a spectrum of accelerated solutions ranging from tightly coupled, single package APUs to a sea of disaggregated GPUs interconnected by a global network. In this paper, we detail NEthing, our methodology and tool for evaluating the potential performance implications of such diverse architectural paradigms. We demonstrate our methodology on today’s and projected 2026 technologies for three distinct workloads: a compute-intensive kernel, a tightly-coupled HPC simulation, and an ensemble of loosely-coupled HPC simulations. Our results leverage NEthing to quantify the increased utilization disaggregated systems must achieve in order to match superior performance of APUs and on-board GPUs.