Student: Mikhail Isaev (Georgia Institute of Technology)
Supervisor: Richard Vuduc (Georgia Institute of Technology)
Abstract: ParaGraph is an open-source toolkit for use in co-designing hardware and software for supercomputer-scale systems. It bridges an infrastructure gap between an application target and existing high-fidelity computer-network simulators. The first component of ParaGraph is a high-level graph representation of a parallel program, which faithfully represents parallelism and communication, can be extracted automatically from a compiler, and is “tuned” for use with network simulators. The second is a runtime that can emulate the representation’s dynamic execution for a simulator. User-extensible mechanisms are available for modeling on-node performance and transforming high-level communication into operations that backend simulators understand. Case studies include deep learning workloads that are extracted automatically from programs written in JAX and TensorFlow and interfaced with several event-driven network simulators. These studies show how system designers can use ParaGraph to build flexible end-to-end software-hardware co-design workflows to tweak communication libraries, find future hardware bottlenecks, and validate simulations with traces.
ACM-SRC Semi-Finalist: no
Poster: PDF
Poster Summary: PDF
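The two components named in the abstract, a graph representation of a parallel program and a runtime that emulates its execution, can be illustrated with a small sketch. This is not ParaGraph's actual API: the `Node` class, the `emulate` function, the `"compute"`/`"comm"` kinds, and the rates in `COST_MODEL` are all hypothetical names and values chosen for illustration. The sketch also assumes unlimited parallel resources, so each node starts as soon as its dependencies finish, and it performs no cycle detection.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One operation in a toy program graph (hypothetical, not ParaGraph's format)."""
    name: str
    kind: str                 # "compute" or "comm" (illustrative categories)
    size: float               # FLOPs for compute nodes, bytes for comm nodes
    deps: List[str] = field(default_factory=list)


# User-replaceable cost models, one per node kind. The rates are assumptions:
# 1e12 FLOP/s on-node throughput, 0.5 s latency plus 1e9 B/s for communication.
COST_MODEL: Dict[str, Callable[[Node], float]] = {
    "compute": lambda n: n.size / 1e12,
    "comm": lambda n: 0.5 + n.size / 1e9,
}


def emulate(nodes: List[Node],
            cost_model: Dict[str, Callable[[Node], float]]) -> Dict[str, float]:
    """Return the finish time of every node, honoring dependencies."""
    by_name = {n.name: n for n in nodes}
    finish: Dict[str, float] = {}

    def visit(name: str) -> float:
        # Memoized depth-first walk: a node starts when its last dependency ends.
        if name not in finish:
            node = by_name[name]
            start = max((visit(d) for d in node.deps), default=0.0)
            finish[name] = start + cost_model[node.kind](node)
        return finish[name]

    for n in nodes:
        visit(n.name)
    return finish


# A three-step training-like pattern: compute, all-reduce, then update.
graph = [
    Node("matmul", "compute", 2e12),
    Node("allreduce", "comm", 1e9, deps=["matmul"]),
    Node("update", "compute", 1e12, deps=["allreduce"]),
]
times = emulate(graph, COST_MODEL)
print(times)  # finish times: matmul 2.0, allreduce 3.5, update 4.5
```

In a real co-design workflow the `"comm"` entry would not be a closed-form formula. The abstract describes handing such operations to a backend network simulator instead, which is the role the replaceable cost-model dictionary stands in for here.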