Workshop: 2022 International Workshop on Performance Portability and Productivity (P3HPC)
Authors: Jacob Lambert (Advanced Micro Devices (AMD), Inc.), Mohammad Monil and Seyong Lee (Oak Ridge National Laboratory (ORNL)), Allen Malony (University of Oregon), and Jeffrey Vetter (Oak Ridge National Laboratory (ORNL))
Abstract: Accelerator-based heterogeneous computing is the de facto standard in current and upcoming exascale machines. These heterogeneous resources empower computational scientists to select a machine or platform well-suited to their domain or applications. However, this diversity of machines also poses challenges related to programming model selection: inconsistent availability of programming models across different exascale systems, lack of performance portability for those programming models that do span several systems, and inconsistent performance between different models on a single platform. We explore these challenges on exascale-like hardware, including AMD MI100 and NVIDIA A100 GPUs. By extending the source-to-source compiler OpenARC, we demonstrate the power of automated translation of applications written in a single front-end programming model (OpenACC) into a variety of back-end models (OpenMP, OpenCL, CUDA, HIP) that span the upcoming exascale environments. This translation enables us to compare performance within and across devices and to analyze programming model behavior with profiling tools.