Student: Anna Fortenberry (University of North Texas)
Supervisor: Stanimire Tomov (University of Tennessee)
Abstract: Supercomputer architectures are growing more diverse, so code must remain both portable and efficient to take advantage of the computing capabilities of the evolving hardware in these systems. Intel has adopted oneAPI, an open standard programming interface for heterogeneous systems designed to allow code portability across different processor architectures. This report evaluates oneAPI by migrating a CUDA general matrix-matrix multiplication (GEMM) algorithm from the dense linear algebra library Matrix Algebra on GPU and Multicore Architectures (MAGMA) to Data Parallel C++ (DPC++), the direct programming language of oneAPI. The performance of the migrated code on multicore CPUs and GPUs is compared with that of native CUDA implementations. The initial migrated code demonstrates impressive performance on multicore CPUs and retains the performance of CUDA on NVIDIA GPUs. It performs poorly on the Intel GPU at first, but tuning improves its performance. Intel's oneAPI thus allowed MAGMA's portability to be extended successfully to multicore CPUs and Intel GPUs.
ACM-SRC Semi-Finalist: no
Poster Summary: PDF