SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Technical Papers Archive

Vectorizing Sparse Matrix Computations with Partially-Strided Codelets

Authors: Kazem Cheshmi (University of Toronto; McMaster University, Ontario, Canada) and Zachary Cetinic and Maryam Mehri Dehnavi (University of Toronto)

Abstract: The compact data structures and irregular computation patterns in sparse matrix computations introduce challenges to vectorizing these codes. Available approaches primarily vectorize strided computation regions of a sparse code. In this work, we propose a locality-based codelet mining (LCM) algorithm that efficiently searches for strided and partially strided regions in sparse matrix computations for vectorization. We also present a classification of partially strided codelets and a differentiation-based approach to generate codelets from memory accesses in the sparse computation. LCM is implemented as an inspector-executor framework called LCM I/E that generates vectorized code for the sparse matrix-vector multiplication (SpMV), sparse matrix times dense matrix (SpMM), and sparse triangular solver (SpTRSV). LCM I/E outperforms the MKL library with an average speedup of 1.67X, 4.1X, and 1.75X for SpMV, SpTRSV, and SpMM, respectively. It is also faster than the state-of-the-art inspector-executor framework Sympiler for the SpTRSV kernel with an average speedup of 1.9X.

Back to Technical Papers Archive Listing