KokkACC: Enhancing Kokkos with OpenACC
Session: Research Posters Display
Description: Kokkos is a representative approach among template metaprogramming solutions. It offers programmers high-level abstractions for generic programming, while most of the device-specific code generation and optimization is delegated to the compiler through template specializations. To this end, Kokkos provides a set of device-specific code specializations in multiple back ends, such as CUDA and HIP. However, maintaining and optimizing a separate device-specific back end for each new device type can be complex and error-prone. To alleviate these concerns, this paper presents an alternative OpenACC back end for Kokkos: KokkACC. KokkACC provides a high-productivity programming environment and, potentially, a multi-architecture back end. We have observed competitive performance; in some cases, KokkACC is faster than NVIDIA’s CUDA back end and much faster than OpenMP’s GPU offloading back end. This work also includes implementation details and a detailed performance study conducted with a set of mini-benchmarks (AXPY and DOT product) and two mini-apps (LULESH and miniFE).
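The idea of selecting a back end through template specialization can be illustrated with a minimal sketch in plain standard C++. It does not use Kokkos or OpenACC. The tag types `SerialBackend` and `UnrolledBackend`, the `ParallelOps` struct, and the `axpy` and `dot` helpers are hypothetical names chosen for illustration, not Kokkos or KokkACC APIs. The two specializations only stand in for device-specific implementations; both run on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical back-end tags (illustrative only, not Kokkos types).
struct SerialBackend {};
struct UnrolledBackend {};

// Primary template is declared but not defined, so requesting an
// unsupported back end is a compile-time error.
template <typename Backend>
struct ParallelOps;

// Specialization 1: a plain sequential loop.
template <>
struct ParallelOps<SerialBackend> {
  template <typename F>
  static void parallel_for(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) f(i);
  }
  template <typename F>
  static double parallel_reduce(std::size_t n, F f) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += f(i);
    return sum;
  }
};

// Specialization 2: a stand-in for a device-specific implementation.
// Here it is just a loop unrolled by two with two partial sums.
template <>
struct ParallelOps<UnrolledBackend> {
  template <typename F>
  static void parallel_for(std::size_t n, F f) {
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) { f(i); f(i + 1); }
    if (i < n) f(i);  // leftover element when n is odd
  }
  template <typename F>
  static double parallel_reduce(std::size_t n, F f) {
    double s0 = 0.0, s1 = 0.0;
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) { s0 += f(i); s1 += f(i + 1); }
    if (i < n) s0 += f(i);  // leftover element when n is odd
    return s0 + s1;
  }
};

// Generic AXPY: y = a * x + y. The user code is written once;
// the Backend parameter picks the specialization.
template <typename Backend>
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
  assert(x.size() == y.size());
  ParallelOps<Backend>::parallel_for(
      x.size(), [&](std::size_t i) { y[i] = a * x[i] + y[i]; });
}

// Generic DOT product: sum over i of x[i] * y[i].
template <typename Backend>
double dot(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size());
  return ParallelOps<Backend>::parallel_reduce(
      x.size(), [&](std::size_t i) { return x[i] * y[i]; });
}
```

Calling `axpy<SerialBackend>(a, x, y)` or `axpy<UnrolledBackend>(a, x, y)` leaves the kernel body unchanged and swaps only the loop implementation. In this sketch, supporting another target would mean adding one more `ParallelOps` specialization, which mirrors the per-device maintenance burden the abstract describes.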