ParaGraph: An Application-Simulator Interface and Toolkit for Hardware-Software Co-Design
Description
ParaGraph is an open-source toolkit for use in co-designing hardware and software for supercomputer-scale systems. It bridges an infrastructure gap between an application target and existing high-fidelity computer-network simulators. The first component of ParaGraph is a high-level graph representation of a parallel program, which faithfully represents parallelism and communication, can be extracted automatically from a compiler, and is “tuned” for use with network simulators. The second is a runtime that can emulate the representation’s dynamic execution for a simulator. User-extensible mechanisms are available for modeling on-node performance and transforming high-level communication into operations that backend simulators understand. Case studies include deep learning workloads that are extracted automatically from programs written in JAX and TensorFlow and interfaced with several event-driven network simulators. These studies show how system designers can use ParaGraph to build flexible end-to-end software-hardware co-design workflows to tweak communication libraries, find future hardware bottlenecks, and validate simulations with traces.
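
The two components described above can be illustrated with a minimal Python sketch. It is not ParaGraph's actual API: the names `Node`, `topo_order`, `emulate`, `compute_time`, and `ring_allreduce_time`, the hardware numbers, and the ring all-reduce cost formula are all assumptions chosen for illustration. The sketch builds a small dependency graph of compute and communication steps, walks it in dependency order, and charges each node a cost from a pluggable model, which stands in for the user-extensible on-node performance and communication mechanisms.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One step of a parallel program (hypothetical structure)."""
    name: str
    kind: str      # "compute" or "allreduce"
    size: float    # FLOPs for compute nodes, bytes for communication nodes
    deps: List[str] = field(default_factory=list)


def topo_order(nodes: List[Node]) -> List[Node]:
    """Return nodes in dependency order using Kahn's algorithm."""
    by_name = {n.name: n for n in nodes}
    indegree = {n.name: len(n.deps) for n in nodes}
    users: Dict[str, List[str]] = {n.name: [] for n in nodes}
    for n in nodes:
        for d in n.deps:
            users[d].append(n.name)
    ready = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(by_name[name])
        for u in users[name]:
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


def emulate(nodes: List[Node],
            cost_models: Dict[str, Callable[[Node], float]]) -> Dict[str, float]:
    """Compute each node's finish time: latest dependency finish plus its own cost."""
    finish: Dict[str, float] = {}
    for n in topo_order(nodes):
        start = max((finish[d] for d in n.deps), default=0.0)
        finish[n.name] = start + cost_models[n.kind](n)
    return finish


# Assumed hardware parameters, for illustration only.
PEAK_FLOPS = 1e12    # FLOP/s per node
BANDWIDTH = 1e10     # bytes/s per link
LATENCY = 1e-6       # seconds per message
NUM_PROCS = 4


def compute_time(node: Node) -> float:
    """On-node model: assume the node runs at peak FLOP rate."""
    return node.size / PEAK_FLOPS


def ring_allreduce_time(node: Node) -> float:
    """Lower an all-reduce to 2*(p-1) ring steps, each moving size/p bytes."""
    steps = 2 * (NUM_PROCS - 1)
    chunk = node.size / NUM_PROCS
    return steps * (LATENCY + chunk / BANDWIDTH)


# A toy data-parallel training step.
graph = [
    Node("forward", "compute", 2e9),
    Node("backward", "compute", 4e9, deps=["forward"]),
    Node("grad_allreduce", "allreduce", 1e8, deps=["backward"]),
    Node("update", "compute", 1e9, deps=["grad_allreduce"]),
]
cost_models = {"compute": compute_time, "allreduce": ring_allreduce_time}
finish = emulate(graph, cost_models)

for name, t in finish.items():
    print(f"{name}: finishes at {t * 1e3:.3f} ms")
```

In this sketch, swapping `ring_allreduce_time` for a different function changes how the collective is modeled without touching the graph, which mirrors the idea of tweaking a communication library independently of the application. A real backend would replace these closed-form costs with events handed to a network simulator.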