2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT
Wei Hu (University of Science and Technology of China); Hong An (University of Science and Technology of China; Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Zhuoqiang Guo (Institute of Computing Technology, Chinese Academy of Sciences); Qingcai Jiang and Xinming Qin (University of Science and Technology of China); Junshi Chen (University of Science and Technology of China; Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Weile Jia (Institute of Computing Technology, Chinese Academy of Sciences); Chao Yang (Peking University); Zhaolong Luo, Jielan Li, and Wentiao Wu (University of Science and Technology of China); Guangming Tan (Institute of Computing Technology, Chinese Academy of Sciences); Dongning Jia (Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Qinglin Lu and Fangfang Liu (Institute of Software, Chinese Academy of Sciences); Min Tian (Qilu University of Technology, Shandong, China); Fang Li (National Research Center of Parallel Computer Engineering and Technology, China); and Yeqi Huang, Liyi Wang, Sha Liu, and Jinlong Yang (University of Science and Technology of China)
Accelerating Elliptic Curve Digital Signature Algorithms on GPUs
Zonghao Feng and Qipeng Xie (Hong Kong University of Science and Technology); Qiong Luo (Hong Kong University of Science and Technology; Hong Kong University of Science and Technology, Guangzhou); and Yujie Chen, Haoxuan Li, Huizhong Li, and Qiang Yan (WeBank, China)
Accelerating Parallel Write via Deeply Integrating Predictive Lossy Compression with HDF5
Sian Jin and Dingwen Tao (Indiana University); Houjun Tang (Lawrence Berkeley National Laboratory (LBNL)); Sheng Di (Argonne National Laboratory (ANL)); Suren Byna and Zarija Lukić (Lawrence Berkeley National Laboratory (LBNL)); and Franck Cappello (Argonne National Laboratory (ANL), University of Illinois)
Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers
Ahmad Abdelfattah (University of Tennessee, Innovative Computing Laboratory (ICL)); Pieter Ghysels and Wajih Boukaram (Lawrence Berkeley National Laboratory (LBNL)); Stanimire Tomov (University of Tennessee, Innovative Computing Laboratory (ICL)); Xiaoye Li (Lawrence Berkeley National Laboratory (LBNL)); and Jack Dongarra (University of Tennessee, Innovative Computing Laboratory (ICL))
AI for Quantum Mechanics: High Performance Quantum Many-Body Simulations via Deep Learning
Xuncheng Zhao, Mingfan Li, and Qian Xiao (University of Science and Technology of China (USTC)); Junshi Chen (University of Science and Technology of China (USTC); Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Fei Wang (Tsinghua University, China); Li Shen (University of Science and Technology of China (USTC)); Meijia Zhao and Wenhao Wu (National Supercomputing Center in Wuxi); Hong An (University of Science and Technology of China (USTC); Pilot National Laboratory for Marine Science and Technology, Qingdao, China); and Lixin He and Xiao Liang (University of Science and Technology of China (USTC))
AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices
Zhen Du (Institute of Computing Technology, Chinese Academy of Sciences; Chinese Academy of Sciences); Jiajia Li (North Carolina State University); and Yinshan Wang, Xueqi Li, Guangming Tan, and Ninghui Sun (Institute of Computing Technology, Chinese Academy of Sciences)
Approximate Computing Through the Lens of Uncertainty Quantification
Konstantinos Parasyris, James Diffenderfer, Harshitha Menon, and Ignacio Laguna (Lawrence Livermore National Laboratory); Jackson Vanover (University of California, Davis); and Ryan Vogt and Daniel Osei-Kuffuor (Lawrence Livermore National Laboratory)
CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs
Qingxiao Sun, Yi Liu, Hailong Yang, Ruizhe Zhang, Ming Dun, Mingzhen Li, and Xiaoyan Liu (Beihang University); Wencong Xiao and Yong Li (Unaffiliated); and Zhongzhi Luan and Depei Qian (Beihang University)
Dynamic Quality Metric Oriented Error Bounded Lossy Compression for Scientific Datasets
Jinyang Liu (University of California, Riverside; Argonne National Laboratory (ANL)); Sheng Di (Argonne National Laboratory (ANL)); Kai Zhao (University of Alabama, Birmingham); Xin Liang (University of Kentucky); Zizhong Chen (University of California, Riverside); and Franck Cappello (Argonne National Laboratory (ANL))
Exaflops Biomedical Knowledge Graph Analytics
Ramakrishnan Kannan, Piyush Sao, and Hao Lu (Oak Ridge National Laboratory (ORNL)); Jakub Kurzak (Advanced Micro Devices (AMD) Inc); Gundolf Schenk and Yongmei Shi (University of California, San Francisco); Seung-Hwan Lim (Oak Ridge National Laboratory (ORNL)); Sharat Israni (University of California, San Francisco); Vijay Thakkar (Georgia Institute of Technology); Guojing Cong and Robert Patton (Oak Ridge National Laboratory (ORNL)); Sergio Baranzini (University of California, San Francisco); Richard Vuduc (Georgia Institute of Technology); and Thomas Potok (Oak Ridge National Laboratory (ORNL))
Extreme Scale Earthquake Simulation with Uncertainty Quantification
Tsuyoshi Ichimura and Kohei Fujita (University of Tokyo, RIKEN); Ryota Kusakabe (University of Tokyo); Kentaro Koyama (Fujitsu Ltd); Sota Murakami and Yuma Kikuchi (University of Tokyo); Takane Hori and Muneo Hori (Japan Agency for Marine-Earth Science and Technology); Hikaru Inoue, Takafumi Nose, and Takahiro Kawashima (Fujitsu Ltd); and Lalith Maddegedara (University of Tokyo)
Extreme-Scale Many-against-Many Protein Similarity Search
Oguz Selvitopi (Lawrence Berkeley National Laboratory (LBNL)); Saliya Ekanayake (Microsoft Corporation); Giulia Guidi (University of California, Berkeley); Muaaz Awan (National Energy Research Scientific Computing Center (NERSC)); Georgios Pavlopoulos (Biomedical Sciences Research Center (BSRC), Greece); Ariful Azad (Indiana University); Nikos Kyrpides (US Department of Energy Joint Genome Institute); Leonid Oliker (Lawrence Berkeley National Laboratory (LBNL)); Katherine Yelick (University of California, Berkeley; Lawrence Berkeley National Laboratory (LBNL)); and Aydin Buluç (Lawrence Berkeley National Laboratory (LBNL); University of California, Berkeley)
From Correctable Memory Errors to Uncorrectable Memory Errors: What Error Bits Tell
Cong Li (Intel Corporation), Yu Zhang (ByteDance Ltd), Jialei Wang and Hang Chen (Intel Corporation), Xian Liu (ByteDance Ltd), Tai Huang (Intel Corporation), Liang Peng (ByteDance Ltd), Shen Zhou (Intel Corporation), and Lixin Wang and Shijian Ge (ByteDance Ltd)
GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics
Maxim Zvyagin (Argonne National Laboratory (ANL)); Alexander Brace (Argonne National Laboratory (ANL), University of Chicago); Kyle Hippe (Argonne National Laboratory (ANL)); Yuntian Deng (NVIDIA Corporation, Harvard University); Bin Zhang and Cindy Bohorquez (Cerebras Systems); Austin Clyde (Argonne National Laboratory (ANL), University of Chicago); Bharat Kale (Northern Illinois University); Danilo Perez-Rivera (Argonne National Laboratory (ANL), New York University (NYU)); Heng Ma (Argonne National Laboratory (ANL)); Carla M. Mann (Argonne National Laboratory (ANL), University of Chicago); Michael Irvin (Argonne National Laboratory (ANL)); J. Gregory Pauloski (University of Chicago); Logan Ward (Argonne National Laboratory (ANL)); Valerie Hayot-Sasson (Argonne National Laboratory (ANL), University of Chicago); Murali Emani, Sam Foreman, and Zhen Xie (Argonne National Laboratory (ANL)); Diangen Lin and Maulik Shukla (Argonne National Laboratory (ANL), University of Chicago); Weili Nie and Josh Romero (NVIDIA Corporation); Christian Dallago (NVIDIA Corporation, Technical University Munich); Arash Vahdat (NVIDIA Corporation); Chaowei Xiao (Arizona State University, NVIDIA Corporation); Thomas Gibbs (NVIDIA Corporation); Ian Foster and James J. Davis (Argonne National Laboratory (ANL), University of Chicago); Michael Papka (Argonne National Laboratory (ANL); University of Illinois, Chicago); Thomas Brettin (Argonne National Laboratory (ANL)); Rick Stevens (Argonne National Laboratory (ANL), University of Chicago); Anima Anandkumar (NVIDIA Corporation, California Institute of Technology); and Venkatram Vishwanath and Arvind Ramanathan (Argonne National Laboratory (ANL))
A GPU-Accelerated AMR Solver for Gravitational Wave Propagation
Milinda Fernando (University of Texas, Oden Institute); David Neilsen and Eric Hirschmann (Brigham Young University); Yosef Zlochower (Rochester Institute of Technology); Hari Sundar (University of Utah); and Omar Ghattas and George Biros (University of Texas, Oden Institute)
HammingMesh: A Network Topology for Large-Scale Deep Learning
Torsten Hoefler (ETH Zürich, Microsoft Corporation); Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, and Shigang Li (ETH Zürich); and Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, and Steve Scott (Microsoft Corporation)
Image Gradient Decomposition for Parallel and Memory-Efficient Ptychographic Reconstruction
Xiao Wang, Aristeidis Tsaris, and Debangshu Mukherjee (Oak Ridge National Laboratory (ORNL)); Mohamed Wahib (RIKEN Center for Computational Science (R-CCS)); Peng Chen (National Institute of Advanced Industrial Science and Technology (AIST), Japan); and Mark Oxley, Olga Ovchinnikova, and Jacob Hinkle (Oak Ridge National Laboratory (ORNL))
Large-Scale Simulation of Quantum Computational Chemistry on a New Sunway Supercomputer
Honghui Shang (Institute of Computing Technology, Chinese Academy of Sciences); Li Shen (University of Science and Technology of China, National Supercomputing Center in Wuxi); Yi Fan (University of Science and Technology of China); Zhiqian Xu (Institute of Computing Technology, Chinese Academy of Sciences); Chu Guo (Shanghai Research Center for Quantum Sciences); Jie Liu (University of Science and Technology of China); Wenhao Zhou (National Supercomputing Center in Wuxi); Huan Ma (University of Science and Technology of China); Rongfen Lin (Tsinghua University, China); Yuling Yang and Fang Li (National Supercomputing Center in Wuxi); Zhuoya Wang (Pilot National Laboratory for Marine Science and Technology, Qingdao, China); Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); and Zhenyu Li (University of Science and Technology of China)
LightSeq2: Accelerated Training for Transformer-Based Models on GPUs
Xiaohui Wang, Yang Wei, and Ying Xiong (ByteDance Ltd, AI Lab); Guyue Huang (University of California, Santa Barbara); Xian Qian (ByteDance Ltd, AI Lab); Yufei Ding (University of California, Santa Barbara); Mingxuan Wang (ByteDance Ltd, AI Lab); and Lei Li (University of California, Santa Barbara)
Mapping Out the HPC Dependency Chaos
Farid Zakaria (University of California, Santa Cruz); Thomas Scogland and Todd Gamblin (Lawrence Livermore National Laboratory); and Carlos Maltzahn (University of California, Santa Cruz)
Memory Optimizations in an Array Language
Philip Munksgaard and Troels Henriksen (University of Copenhagen), Ponnuswamy Sadayappan (University of Utah), and Cosmin Oancea (University of Copenhagen)
MetaWBC: POSIX-Compliant Metadata Write-Back Caching for Distributed File Systems
Yingjin Qian (DataDirect Networks (DDN)); Wen Cheng (Huazhong University of Science and Technology (HUST)); Lingfang Zeng (Zhejiang Lab); Marc-André Vef (Johannes Gutenberg University Mainz); Oleg Drokin and Andreas Dilger (Whamcloud Inc); Shuichi Ihara (DataDirect Networks (DDN)); Wusheng Zhang (Tsinghua University, China); Yang Wang (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); and André Brinkmann (Johannes Gutenberg University Mainz)
Optimization of Full-Core Reactor Simulations on Summit
Misun Min (Argonne National Laboratory (ANL)); Yu-Hsiang Lan (University of Illinois); Paul Fischer (University of Illinois, Argonne National Laboratory (ANL)); Elia Merzari (Pennsylvania State University, Argonne National Laboratory (ANL)); Stefan Kerkemeier (Argonne National Laboratory (ANL)); Malachi Phillips and Thilina Rathnayake (University of Illinois); April Novak (Argonne National Laboratory (ANL)); Derek Gaston (Idaho National Laboratory); Noel Chalmers (AMD Research); and Tim Warburton (Virginia Tech)
Parla: A Python Orchestration System for Heterogeneous Architectures
Hochan Lee and William Ruys (University of Texas); Ian Henriksen (University of Texas, Jabberwock Technologies Inc); Arthur Peters (University of Texas, Katana Graph Inc); Yineng Yan, Sean Stephens, Bozhi You, and Henrique Fingler (University of Texas); Martin Burtscher (Texas State University); Milos Gligoric (University of Texas); Karl Schulz and Keshav Pingali (University of Texas, Oden Institute); Christopher J. Rossbach and Mattan Erez (University of Texas); and George Biros (University of Texas, Oden Institute)
PolarFly: A Cost-Effective and Flexible Low-Diameter Topology
Kartik Lakhotia (Intel Corporation, Intel Labs); Maciej Besta (ETH Zürich); Laura Monroe (Los Alamos National Laboratory (LANL)); Kelly Isham (Colgate University); Patrick Iff and Torsten Hoefler (ETH Zürich); and Fabrizio Petrini (Intel Corporation)
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations
Maciej Besta (ETH Zürich); Cesare Miglioli (University of Geneva, Switzerland); Paolo Sylos Labini (Free University of Bozen-Bolzano, Italy); Jakub Tětek (University of Copenhagen); Patrick Iff (ETH Zürich); Raghavendra Kanakagiri (University of Illinois); Saleh Ashkboos (ETH Zürich); Kacper Janda (AGH University of Science and Technology, Krakow, Poland); Michal Podstawski (Warsaw University of Technology); Grzegorz Kwasniewski and Niels Gleinig (ETH Zürich); Flavio Vella (University of Trento, Italy); and Onur Mutlu and Torsten Hoefler (ETH Zürich)
Productive Performance Engineering for Weather and Climate Modeling with Python
Tal Ben-Nun (ETH Zürich); Linus Groner (Swiss National Supercomputing Centre (CSCS)); Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver D. Elbert, Rhea George, and Jeremy McGibbon (Allen Institute for Artificial Intelligence); Lukas Trümper (ETH Zürich); Elynn Wu and Oliver Fuhrer (Allen Institute for Artificial Intelligence); Thomas Schulthess (Swiss National Supercomputing Centre (CSCS)); and Torsten Hoefler (ETH Zürich)
Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers
Luca Fedeli (University of Paris-Saclay); Axel Huebl (Lawrence Berkeley National Laboratory (LBNL)); France Boillod-Cerneux and Thomas Clark (University of Paris-Saclay); Kevin Gott (Lawrence Berkeley National Laboratory (LBNL)); Conrad Hillairet (ARM Ltd); Stephan Jaure (Bull Atos Technologies); Adrien Leblanc (National Institute of Advanced Technology (ENSTA Paris)); Rémi Lehe and Andrew Myers (Lawrence Berkeley National Laboratory (LBNL)); Christelle Piechurski (GENCI, France); Mitsuhisa Sato (RIKEN); Neil Zaïm (University of Paris-Saclay); Weiqun Zhang and Jean-Luc Vay (Lawrence Berkeley National Laboratory (LBNL)); and Henri Vincenti (University of Paris-Saclay)
ReSemble: Reinforced Ensemble Framework for Data Prefetching
Pengmiao Zhang (University of Southern California (USC)); Rajgopal Kannan (United States Army Research Laboratory, University of Southern California (USC)); Ajitesh Srivastava (University of Southern California (USC)); Anant V. Nori (Intel Corporation); and Viktor K. Prasanna (University of Southern California (USC))
Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications
Qinglei Cao (University of Tennessee, Innovative Computing Laboratory); Sameh Abdulah and Rabab Alomairy (King Abdullah University of Science and Technology (KAUST)); Yu Pei (University of Tennessee, Innovative Computing Laboratory); Pratik Nag (King Abdullah University of Science and Technology (KAUST)); George Bosilca (University of Tennessee, Innovative Computing Laboratory); Jack Dongarra (University of Tennessee, Innovative Computing Laboratory; Oak Ridge National Laboratory (ORNL)); and Marc Genton, David Keyes, Hatem Ltaief, and Ying Sun (King Abdullah University of Science and Technology (KAUST))
Running Ahead of Evolution - AI Based Simulation for Predicting Future High-Risk SARS-CoV-2 Variants
Jie Chen (Peng Cheng Laboratory; School of Electronic and Computer Engineering, Peking University); Zhiwei Nie (School of Electronic and Computer Engineering, Peking University; Peng Cheng Laboratory); Yu Wang, Kai Wang, Fan Xu, Zhennan Wang, Guoli Song, Xiansong Huang, and Zhixiang Ren (Peng Cheng Laboratory); Bin Zhou (School of Information Science and Engineering, Shandong University); Chao Yang (School of Mathematical Sciences, Peking University); and Yonghong Tian (Peng Cheng Laboratory; School of Electronic and Computer Engineering, Peking University)
Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation
William Moses (Massachusetts Institute of Technology (MIT)); Sri Hari Krishna Narayanan (Argonne National Laboratory (ANL)); Ludger Paehler (Technical University Munich); Valentin Churavy (Massachusetts Institute of Technology (MIT)); and Michel Schanen, Jan Hueckelheim, Johannes Doerfert, and Paul Hovland (Argonne National Laboratory (ANL))
Scalable Irregular Parallelism with GPUs: Getting CPUs Out of the Way
Yuxin Chen (University of California, Davis); Benjamin Brock (University of California, Berkeley); Serban Porumbescu (University of California, Davis); Aydin Buluç (Lawrence Berkeley National Laboratory (LBNL)); Katherine Yelick (University of California, Berkeley); and John Owens (University of California, Davis)
Scaling Correlated Fragment Molecular Orbital Calculations on Summit
Giuseppe Barca and Calum Snowdon (Australian National University); Jorge Galvez-Vallejo (Iowa State University); Fazeleh Kazemian (Australian National University); Alistair Rendell (Flinders University, Australia); and Mark S. Gordon (Iowa State University)
SFS: Smart OS Scheduling for Serverless Functions
Yuqi Fu (University of Virginia), Liu Li (George Mason University (GMU)), Haoliang Wang (Adobe Research), Yue Cheng (University of Virginia), and Songqing Chen (George Mason University (GMU))
Symmetric Block-Cyclic Distribution: Fewer Communications Leads to Faster Dense Cholesky Factorization
Olivier Beaumont (French Institute for Research in Computer Science and Automation (INRIA)); Philippe Duchon (LaBRI, France); Lionel Eyraud-Dubois (French Institute for Research in Computer Science and Automation (INRIA)); Julien Langou (University of Colorado, Denver); and Mathieu Verite (French Institute for Research in Computer Science and Automation (INRIA))
A Taxonomy of Error Sources in HPC I/O Machine Learning Models
Mihailo Isakov, Mikaela Currier, and Eliakin del Rosario (Arizona State University); Sandeep Madireddy, Prasanna Balaprakash, Philip H. Carns, and Robert B. Ross (Argonne National Laboratory (ANL)); Glenn K. Lockwood (Lawrence Berkeley National Laboratory (LBNL)); and Michel A. Kinsy (Arizona State University)
Toward Scalable Resource Management for Supercomputers
Yiqin Dai, Yong Dong, Kai Lu, Ruibo Wang, Wei Zhang, Juan Chen, and Mingtian Shao (National University of Defense Technology (NUDT), China) and Zheng Wang (University of Leeds)
Using Answer Set Programming for HPC Dependency Solving
Todd Gamblin (Lawrence Livermore National Laboratory), Massimiliano Culpo (Np-Complete S.r.l.), and Gregory Becker and Sergei Shudler (Lawrence Livermore National Laboratory)
Using Unused: Non-Invasive Dynamic FaaS Infrastructure with HPC-Whisk
Bartłomiej Przybylski (Institute of Informatics, University of Warsaw, Poland); Maciej Pawlik (AGH University of Science and Technology, Krakow, Poland; Academic Computer Centre Cyfronet AGH, Krakow, Poland); Paweł Żuk (Institute of Informatics, University of Warsaw, Poland); Bartłomiej Łagosz (AGH University of Science and Technology, Krakow, Poland); Maciej Malawski (Sano Centre for Computational Medicine, Krakow, Poland; AGH University of Science and Technology, Krakow, Poland); and Krzysztof Rzadca (Institute of Informatics, University of Warsaw, Poland)
vGraph: Memory-Efficient Multicore Graph Processing for Traversal-Centric Algorithms
Menghan Jia (National University of Defense Technology (NUDT), China); Yiming Zhang (Xiamen University; National University of Defense Technology (NUDT), China); Xinbiao Gan and Dongsheng Li (National University of Defense Technology (NUDT), China); Erci Xu (Xiamen University; National University of Defense Technology (NUDT), China); and Ruibo Wang and Kai Lu (National University of Defense Technology (NUDT), China)
W-Cycle SVD: A Multilevel Algorithm for Batched SVD on GPUs
Junmin Xiao, Yunfei Pang, Qing Xue, and Chaoyang Shui (Institute of Computing Technology, Chinese Academy of Sciences); Ke Meng (Alibaba Group); and Hui Ma, Mingyi Li, Xiaoyang Zhang, and Guangming Tan (Institute of Computing Technology, Chinese Academy of Sciences)