SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

AppEKG: A Simple Unifying View of HPC Applications in Production


Workshop: PMBS22: The 13th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

Authors: Mohammad Al-Tahat, Strahinja Trecakov, and Jonathan Cook (New Mexico State University)


Abstract: While many good development-oriented tools exist for analyzing and improving the performance of HPC applications, the capability to capture and analyze the dynamic behavior of applications in real production runs is lacking. Many heavily used applications do keep some internal metrics of their performance, but there is no unified way of using these metrics. In this paper we present the initial idea of AppEKG, both a concept and a prototype tool for providing a unified, understandable view of HPC application behavior in production. Our prototype AppEKG framework can achieve less than 1% overhead, making it usable in production, while still providing dynamic data collection that captures time-varying runtime behavior.




