SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research Posters Archive

Exploring Performance of GeoCAT data analysis routines on GPUs

Authors: Haniye Kashgarani (University of Wyoming); Cena Miller, Supreeth Suresh, and Anissa Zacharias (National Center for Atmospheric Research (NCAR))

Abstract: GeoCAT-comp is a Python toolkit that the geoscience community uses to analyze data. This project explores ways to port GeoCAT-comp to GPUs, as recent supercomputers increasingly include GPU accelerators as their major compute resource. GeoCAT-comp's routines are currently either sequential or parallelized on the CPU with Dask, but the data processing is embarrassingly parallel and computationally costly, which makes it a good candidate for GPU optimization. GeoCAT builds on NumPy and Xarray arrays and uses Dask arrays for CPU parallelization. In this project, we examined different GPU-accelerated Python packages (e.g., Numba and CuPy). Taking into account how readily the final porting method could be delivered to the GeoCAT team, we selected CuPy, a CUDA-enabled Python array library whose interface is very similar to NumPy's. We compared the performance of the GPU-accelerated code against the Dask CPU-parallelized code over various array sizes and resources, and through strong and weak scaling.
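
To illustrate why a NumPy-like backend is easy to deliver, the following is a minimal sketch, not code from GeoCAT-comp or the poster. The routine `standardized_anomaly`, the helper `to_host`, and the array shape are hypothetical and chosen only for illustration. The sketch assumes that an array routine written against the NumPy interface can run unchanged on CuPy arrays, and it falls back to NumPy when CuPy or a CUDA device is unavailable.

```python
import numpy as np

# Pick the array backend: CuPy if it is installed and a CUDA device is
# visible, otherwise plain NumPy. `xp` is a conventional alias, not a
# GeoCAT-comp name.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np


def standardized_anomaly(field):
    """Hypothetical elementwise routine: (x - mean) / std along axis 0.

    It uses only array methods shared by NumPy and CuPy, so the same
    code runs on either backend.
    """
    mean = field.mean(axis=0, keepdims=True)
    std = field.std(axis=0, keepdims=True)
    return (field - mean) / std


def to_host(array):
    """Return a NumPy array, copying from the GPU if needed."""
    # CuPy arrays expose .get() to copy to host memory; NumPy arrays do not.
    return array.get() if hasattr(array, "get") else np.asarray(array)


# Illustrative input: (time, lat, lon) with made-up sizes.
rng = np.random.default_rng(seed=0)
host_data = rng.normal(size=(4, 3, 2))

device_data = xp.asarray(host_data)   # host-to-device copy when xp is CuPy
result = to_host(standardized_anomaly(device_data))
reference = standardized_anomaly(host_data)  # same routine on the CPU
```

Because each output element depends only on values along one axis, the work is independent across grid points, which is the embarrassingly parallel structure the abstract refers to. Any speedup depends on array size and on host-to-device transfer cost, so it would need to be measured rather than assumed.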

Best Poster Finalist (BP): no

Poster: PDF
Poster summary: PDF
