SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Birds of a Feather Archive

Converged Computing: Bringing Together HPC and Cloud Communities

Authors: Daniel Milroy (Lawrence Livermore National Laboratory), Marquita Ellis (IBM TJ Watson Research Center), Sameer Shende (University of Oregon), Michela Taufer (University of Tennessee), Bill Magro (Google LLC), Jason Kincl (Red Hat Inc)

Abstract: Cloud computing technologies such as elastic scaling, application containerization, and container orchestration are gaining prevalence in HPC due to their benefits of resource dynamism, automation, reproducibility, and resilience. Similarly, HPC technologies for application performance optimization and sophisticated scheduling of complex resources are being integrated into modern cloud infrastructures. This trend is leading to a new domain of Converged Computing, an environment that combines the best capabilities from both worlds. In this highly interactive BoF, we invite experts from both communities and the audience to discuss their current experiences with converged computing and share their views on its future.

Long Description: Total cloud revenue increased by 50% from 2019 to 2021. Gartner projects that total revenue from public cloud will reach $544B by the end of 2022 and expects growth to accelerate through 2025. Research and development in cloud technologies translates the benefits of ever-increasing revenue into rapid progress. The HPC community is recognizing that the automation, elasticity, reproducibility, and resilience resulting from cloud R&D can help manage the complexity of increasingly heterogeneous systems and composite scientific workflows. Cloud providers and researchers, for their part, appreciate the decades of contributions the HPC community has made to advancing computing and are attempting to integrate them.

Elastic scaling, application containerization, and container orchestration are some cloud-native techniques gaining prevalence in HPC. Cloud providers are hard at work to increase performance and efficiency of their infrastructure and translate improvements to hosted applications. We anticipate cloud and HPC technologies will further integrate into a converged computing environment which combines the best capabilities of both worlds. This BoF has not been held before at SC and is introducing a new area of collaboration.

In the first half of this BoF, we will invite public cloud providers and researchers to present summaries of their work toward converged computing, followed by Q&A. We will also ask them to present their views on the future of converged computing, based on motivating use cases and challenges, and will encourage audience interaction. The second half of the session will direct key questions to the audience:

1. What will be the role of public and private clouds in the future?
2. How can we study and understand performance in a converged environment? What tools and capabilities will be useful for profiling, tracing, and debugging?
3. Cloud generally features declarative management which simplifies automation. HPC is closer to an imperative design where performance-oriented users specify exactly where and how applications run. How can we simplify automation and performance in a converged environment? Which model will work best? What blockers or gaps prevent converged computing from seamlessly operating both?
4. How do we train users with cloud or HPC backgrounds to do work on converged systems?

We will conclude the session with community building: planning future converged computing sessions and discussing workshop hosting.

