Vectorizing Sparse Matrix Computations with Partially-Strided Codelets
Description
The compact data structures and irregular computation patterns in sparse matrix computations introduce challenges to vectorizing these codes. Available approaches primarily vectorize strided computation regions of a sparse code. In this work, we propose a locality-based codelet mining (LCM) algorithm that efficiently searches for strided and partially strided regions in sparse matrix computations for vectorization. We also present a classification of partially strided codelets and a differentiation-based approach to generate codelets from memory accesses in the sparse computation. LCM is implemented as an inspector-executor framework called LCM I/E that generates vectorized code for sparse matrix-vector multiplication (SpMV), sparse matrix times dense matrix multiplication (SpMM), and the sparse triangular solver (SpTRSV). LCM I/E outperforms the MKL library with an average speedup of 1.67X, 4.1X, and 1.75X for SpMV, SpTRSV, and SpMM, respectively. It is also faster than the state-of-the-art inspector-executor framework Sympiler for the SpTRSV kernel with an average speedup of 1.9X.
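The abstract does not spell out the LCM algorithm or the codelet classification, so the following is only a rough illustration of the underlying idea: taking differences of the index sequence a sparse kernel reads can reveal whether a region is accessed with a constant stride. It assumes a CSR layout, and the names `csr_spmv`, `first_differences`, and `row_is_strided` are hypothetical helpers invented for this sketch, not part of LCM I/E.

```python
# Illustrative sketch only: not the LCM algorithm from the presentation.
# Assumes CSR storage (indptr, indices, data); all names are hypothetical.

def csr_spmv(indptr, indices, data, x):
    """Reference CSR SpMV: y[i] = sum over k of data[k] * x[indices[k]]."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y


def first_differences(seq):
    """Differences between consecutive entries of an index sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]


def row_is_strided(indptr, indices, i):
    """True when the column indices of row i advance by one constant step,
    so the reads of x in that row follow a single stride."""
    cols = indices[indptr[i]:indptr[i + 1]]
    return len(set(first_differences(cols))) <= 1


# A 3x5 example matrix in CSR form.
indptr = [0, 4, 7, 10]
indices = [0, 1, 2, 3,   # row 0: unit stride
           0, 1, 4,      # row 1: irregular steps (1, then 3)
           0, 2, 4]      # row 2: constant stride of 2
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x = [1.0] * 5

y = csr_spmv(indptr, indices, data, x)
strided_rows = [row_is_strided(indptr, indices, i) for i in range(3)]
```

In this toy matrix, rows 0 and 2 read `x` with a constant step and are natural candidates for vector loads, while row 1 is not. How regions with only some strided accesses are grouped into partially strided codelets is specific to the presented work and is not reproduced here.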
Event Type
Time
Tuesday, 15 November 2022, 4pm - 4:30pm CST
Registration Categories
System Software