SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Technical Papers Archive

CA3DMM: A New Algorithm Based on a Unified View of Parallel Matrix Multiplication


Authors: Hua Huang and Edmond Chow (Georgia Institute of Technology)

Abstract: This paper presents the Communication-Avoiding 3D Matrix Multiplication (CA3DMM) algorithm, a simple and novel algorithm that has optimal or near-optimal communication cost. CA3DMM is based on a unified view of parallel matrix multiplication. Such a view generalizes 1D, 2D, and 3D matrix multiplication algorithms to reduce the data exchange volume for different shapes of input matrices. CA3DMM further minimizes the actual communication costs by carefully organizing its communication patterns. CA3DMM is much simpler than some other generalized 3D algorithms, and CA3DMM does not require low-level optimization. Numerical experiments show that CA3DMM has good parallel scalability and has similar or better performance when compared to state-of-the-art PGEMM implementations for a wide range of matrix dimensions and number of processes.


Presentation: file


Back to Technical Papers Archive Listing