Authors: Xiao Wang, Aristeidis Tsaris, and Debangshu Mukherjee (Oak Ridge National Laboratory (ORNL)); Mohamed Wahib (RIKEN Center for Computational Science (R-CCS)); Peng Chen (National Institute of Advanced Industrial Science and Technology (AIST), Japan); and Mark Oxley, Olga Ovchinnikova, and Jacob Hinkle (Oak Ridge National Laboratory (ORNL))
Abstract: Ptychography is a popular microscopic imaging modality and holds the record for the highest image resolution. Unfortunately, the high image resolution requires a significant amount of memory and computation, forcing many applications to compromise their image resolution in exchange for a smaller memory footprint and a shorter reconstruction time. In this paper, we propose a novel image gradient decomposition method that significantly reduces the memory footprint by tessellating image gradients and measurements into tiles. In addition, we propose a parallel decomposition method that enables asynchronous point-to-point communications and pipelining with minimal parallel overhead. Our experiments on a large-scale Titanate material dataset show that Gradient Decomposition reduces the memory footprint by 51 times and achieves a time-to-solution of 2.2 minutes by scaling to 4158 GPUs, with a super-linear speedup at 364% efficiency. This performance is 2.7 times more memory efficient, 9 times more scalable, and 86 times faster than the state-of-the-art algorithm.