Authors: Wei Hu (University of Science and Technology of China); Hong An (University of Science and Technology of China; Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Zhuoqiang Guo (Institute of Computing Technology, Chinese Academy of Sciences); Qingcai Jiang and Xinming Qin (University of Science and Technology of China); Junshi Chen (University of Science and Technology of China; Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Weile Jia (Institute of Computing Technology, Chinese Academy of Sciences); Chao Yang (Peking University); Zhaolong Luo, Jielan Li, and Wentiao Wu (University of Science and Technology of China); Guangming Tan (Institute of Computing Technology, Chinese Academy of Sciences); Dongning Jia (Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Qinglin Lu and Fangfang Liu (Institute of Software, Chinese Academy of Sciences); Min Tian (Qilu University of Technology, Shandong, China); Fang Li (National Research Center of Parallel Computer Engineering and Technology, China); and Yeqi Huang, Liyi Wang, Sha Liu, and Jinlong Yang (University of Science and Technology of China)
Abstract: Over the past three decades, ab initio electronic structure calculations of large, complex, and metallic systems have been limited to tens of thousands of atoms by the accuracy and efficiency achievable on leadership supercomputers. We present a massively parallel discontinuous Galerkin density functional theory (DGDFT) implementation that uses adaptive local basis functions to discretize the Kohn-Sham equation, yielding a block-sparse Hamiltonian matrix. A highly efficient pole expansion and selected inversion (PEXSI) sparse direct solver is implemented in DGDFT to achieve O(N^1.5) scaling for quasi-two-dimensional systems. With DGDFT, we compute the electronic structure of complex metallic heterostructures containing 2.5 million atoms (17.2 million electrons) using 35.9 million cores on the new Sunway supercomputer. PEXSI reaches a peak performance of 64 PFLOPS (5% of theoretical peak), which is unprecedented for sparse direct solvers. This accomplishment paves the way for extending quantum mechanical simulations to the mesoscopic scale for the design of next-generation electronic devices.