Authors: Pawel Gepner (Graphcore, Warsaw University of Technology), Hatem Ltaief (King Abdullah University of Science and Technology (KAUST)), Łukasz Anaczkowski (Graphcore), Hubert Chrzaniuk (Graphcore)
Abstract: We explore the possibilities of a hybrid system capable of solving both HPC and AI scientific problems. Such a hybrid architecture combines classical HPC platforms with dedicated AI chip systems, a synergy that is important given the computational challenges brought to the fore by massively parallel Exascale systems.
We discuss the system functionality, the algorithmic software adaptations, and performance considerations. We present efforts to support AI/ML applications in addition to seismic imaging, climate/weather prediction, and computational astronomy on hybrid systems. In particular, we investigate how Graphcore’s IPU can accelerate hybrid HPC applications beyond the originally intended AI workloads.
Long Description: HPC is undergoing a transformational change. On the one hand, traditional scientific applications are increasingly accelerated using AI methods running on innovative hardware such as the IPU. On the other hand, there are several efforts to map traditional HPC workloads onto customised AI chips. The challenge lies in integrating AI methods and hardware into the HPC workflow so that they are complementary, efficient, and offer real benefits. In this BoF, we explore the possibilities of a hybrid system architecture capable of solving both HPC and AI scientific problems. Such an architecture would need to utilize both classical HPC platforms and dedicated AI chip systems, which is particularly important in the context of the computational challenges presented by new types of Exascale systems. Beyond the implications at the chip level, we also discuss the system functionality, the algorithmic software adaptation, and performance considerations. By exploring the potential of this heterogeneous approach, we can devise a system capable of accelerating the most complex simulation problems. This system can be built using customised AI chips and systems, including Graphcore’s IPU technology, to execute new hybrid algorithmic accelerations. We focus on the details of this synergistic system architecture and assess its benefits, challenges, and programming model.
HPC and AI have historically and practically evolved along different swim lanes and therefore have different sets of requirements. The BoF will start by igniting discussion: we will set the context for HPC and AI workloads and list the differences and unique qualities of each. Drawing on the spirit of both areas, we will show how these two different approaches to handling computation and data movement can potentially be used within one experiment in a synergistic way. We will then move closer to the practical details of implementation and, together with the panel of experts, consider possible hardware infrastructure designs at the cluster level. We will also consider a software stack that would allow researchers and early adopters to utilize all the hardware of such a hybrid system efficiently and conveniently. At this point we will also present the possibilities of using a new type of accelerator such as the IPU, and we will introduce the concept of scaling out IPU processors with IPU POD systems, which target larger hybrid installations.
Having defined all the elements of this vision, we will present several experiments, conducted with our clients and partners, that show the potential of using both HPC and IPU-based AI within a single experiment to tackle concrete scientific challenges. Inspired by these examples, we will consider, together with the audience, other potential use cases for the envisioned heterogeneous infrastructure and the novel approach presented. As the outcome of this discussion, we hope to create a spark that will in the future evolve into full-blown ideas on how AI could revolutionise some areas, making it possible to achieve far more than traditional HPC ever could.
URL: https://cemse.kaust.edu.sa/hicma/news/steering-customized-ai-architectures-hpc-scientific-workloads-bird-feather-sc22