Efficiently Learning Locality Optimizations by Decomposing Transformation Domains
Description
Achieving full automation of program optimization is still an open problem for compiler writers. This work explores machine learning as a potential solution for learning data locality optimizations for tensor applications. Training models with supervised learning for loop-nest optimization often requires prohibitively expensive training data generation to learn the combined effects of a transformation sequence. As a solution, this work proposes a novel learning strategy called Composed Singular Prediction (CSP) that significantly reduces the training data generation cost for learned loop transformation models. The learned models are then deployed to predict data locality optimization schedules for Conv2d kernels, achieving performance improvements of up to 4x over Intel oneDNN while reducing training data collection time by over 100x compared to exhaustive search.
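The cost argument behind decomposing transformation domains can be illustrated with a back-of-the-envelope sketch. Exhaustive data collection must benchmark every combination of transformation choices, while a strategy that learns each transformation in isolation (as CSP does) only needs samples proportional to the sum of the domain sizes. The domain names and counts below are hypothetical, chosen only to make the arithmetic concrete; the abstract does not specify the actual transformation space:

```python
# Hypothetical per-transformation option counts for a Conv2d loop nest
# (illustrative values only, not taken from the poster).
domains = {
    "tile_size": 8,
    "loop_order": 6,
    "fusion": 4,
    "vectorize": 3,
}

# Exhaustive search must measure every combined schedule:
# the product of all domain sizes.
exhaustive_samples = 1
for n in domains.values():
    exhaustive_samples *= n          # 8 * 6 * 4 * 3 = 576

# A decomposed, CSP-style strategy trains on one transformation
# domain at a time, so data collection grows with the sum instead.
decomposed_samples = sum(domains.values())   # 8 + 6 + 4 + 3 = 21

print(f"exhaustive: {exhaustive_samples} samples")
print(f"decomposed: {decomposed_samples} samples")
print(f"reduction:  {exhaustive_samples / decomposed_samples:.1f}x")
```

With more transformations or larger domains, the product grows multiplicatively while the sum grows additively, which is consistent with the over-100x reduction in data collection time the work reports.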
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Time
Tuesday, 15 November 2022, 8:30am - 5pm CST