Authors: Milinda Fernando (University of Texas, Oden Institute); David Neilsen and Eric Hirschmann (Brigham Young University); Yosef Zlochower (Rochester Institute of Technology); Hari Sundar (University of Utah); and Omar Ghattas and George Biros (University of Texas, Oden Institute)
Abstract: Simulations to calculate a single gravitational waveform (GW) can take several weeks. Yet, thousands of such simulations are needed for the detection and interpretation of gravitational waves. Future detectors will require even more accurate waveforms. Here we present the first large-scale, adaptive-mesh, multi-GPU numerical relativity (NR) code along with performance analysis and benchmarking. While comparisons are difficult to make, our GPU extension of the Dendro-GR NR code achieves a 6x speedup over existing state-of-the-art codes. We achieve 800 GFlop/s on a single NVIDIA A100 GPU with an overall 2.5x speedup over a two-socket, 128-core AMD EPYC 7763 CPU node with an equivalent CPU implementation. We present detailed performance analyses, parallel scalability results, and accuracy assessment for GWs computed for mass ratios q = 1, 2, 4. We also present strong scalability up to 8 A100s and weak scaling up to 229,376 x86 cores on the Texas Advanced Computing Center's Frontera system.