SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Methodology for Evaluating the Potential of Disaggregated Memory Systems


Workshop: 2nd International Workshop on RESource DISaggregation in High Performance Computing (RESDIS)

Authors: Nan Ding and Samuel Williams (Lawrence Berkeley National Laboratory (LBNL)); Hai Ah Nam, Taylor Groves, Muaaz Gul Awan, and Christopher Delay (National Energy Research Scientific Computing Center (NERSC)); Oguz Selvitopi and Leonid Oliker (Lawrence Berkeley National Laboratory (LBNL)); and Nicholas Wright (National Energy Research Scientific Computing Center (NERSC))


Abstract: Tightly coupled HPC systems allocate memory rigidly, which can leave expensive memory resources under-utilized. As novel memory and network technologies mature, disaggregated memory systems are becoming a promising solution for future HPC systems because they allow workloads to use the available memory of the entire system. We propose a design framework to explore the disaggregated memory system design space. The framework incorporates memory capacity, network bandwidth, and the ratio of local to remote memory accesses, and provides an intuitive approach to guide machine configurations based on technology trends and workload characteristics. We apply our framework to analyze eleven workloads from five computational scenarios: AI training, data analysis, genomics, protein, and traditional HPC. We demonstrate that our methodology can expose the potential and pitfalls of a disaggregated memory system and motivate machine configurations. Our methodology shows that 10 of our 11 applications/workflows can leverage disaggregated memory without affecting performance.







