Authors: Ramakrishnan Kannan, Piyush Sao, and Hao Lu (Oak Ridge National Laboratory (ORNL)); Jakub Kurzak (Advanced Micro Devices (AMD) Inc); Gundolf Schenk and Yongmei Shi (University of California, San Francisco); Seung-Hwan Lim (Oak Ridge National Laboratory (ORNL)); Sharat Israni (University of California, San Francisco); Vijay Thakkar (Georgia Institute of Technology); Guojing Cong and Robert Patton (Oak Ridge National Laboratory (ORNL)); Sergio Baranzini (University of California, San Francisco); Richard Vuduc (Georgia Institute of Technology); and Thomas Potok (Oak Ridge National Laboratory (ORNL))
Abstract: We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., the full biomedical literature), which consist of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases, formulate the relationship-mining problem as an all-pairs shortest paths (APSP) problem, and validate connective paths against curated biomedical knowledge graphs (e.g., SPOKE). In this context, we present COAST (Exascale Communication-Optimized All-Pairs Shortest Path) and demonstrate 1.004 EF/s on 9,200 Frontier nodes (73,600 GCDs). We develop hyperbolic performance models (HYPERMOD), which guide optimizations and parametric tuning. The proposed COAST algorithm achieved a memory-constant parallel efficiency of 99% in the single-precision tropical semiring. Looking forward, COAST will enable the integration of scholarly corpora like PubMed into the SPOKE biomedical knowledge graph.
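For readers unfamiliar with the formulation, the following is a minimal sequential sketch of APSP over the tropical (min, +) semiring, using the textbook Floyd-Warshall recurrence. It is not the COAST implementation, which is a distributed GPU algorithm; the function name `min_plus_apsp` and the four-node example graph are assumptions made purely for illustration.

```python
import math

INF = math.inf  # additive identity of the (min, +) semiring: "no edge"


def min_plus_apsp(weights):
    """Return all-pairs shortest path lengths for a dense weight matrix.

    Illustrative textbook Floyd-Warshall, not the COAST algorithm.
    weights[i][j] is the edge weight from i to j, INF if absent,
    and 0 on the diagonal. Assumes no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not mutated
    for k in range(n):
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == INF:
                continue  # no path i -> k, so k cannot improve row i
            row_i = dist[i]
            for j in range(n):
                # Semiring "multiply" is +, semiring "add" is min.
                cand = d_ik + row_k[j]
                if cand < row_i[j]:
                    row_i[j] = cand
    return dist


# Hypothetical four-node directed graph used only as an example.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
result = min_plus_apsp(graph)
```

The inner update has the same shape as a matrix multiply with (+, ×) replaced by (min, +), which is why APSP throughput can be reported in floating-point operations per second.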