Workshop: PDSW22: 7th International Parallel Data Systems Workshop
Authors: Rajeev Jain (Argonne National Laboratory (ANL)), Houjun Tang (Lawrence Berkeley National Laboratory (LBNL)), Akash Dhruv (Argonne National Laboratory (ANL)), Austin Harris (Oak Ridge National Laboratory (ORNL)), and Suren Byna (Lawrence Berkeley National Laboratory (LBNL))
Abstract: Most high-fidelity physics simulation codes, such as Flash-X, need to save intermediate results (checkpoint files) in order to restart or to gain insight into the evolution of the simulation. These simulation codes save such intermediate files synchronously, so computation stalls while the data is written to storage. Depending on the problem size and computational requirements, this file write time can be a substantial portion of the total simulation time. In this paper, we evaluate the overheads and the overall benefit of asynchronous I/O in HDF5 for simulations. Results from real-world high-fidelity simulations on the Summit supercomputer show that I/O operations are overlapped with application communication, computation, or both, effectively hiding some or all of the I/O latency. Our evaluation shows that although asynchronous I/O adds overhead to the application, the reduction in I/O time is more significant, resulting in an overall performance speedup of up to 1.5X.