Authors: Scott Levy (Sandia National Laboratories), Whit Schonbein (Sandia National Laboratories), Ryan Grant (Queen's University, Canada)
Abstract: The number and diversity of intelligent network devices have recently exploded. In particular, a variety of network interface cards (NICs) and data processing units (DPUs) that incorporate computational resources have recently become widely available. Examples of these new devices include Nvidia's BlueField DPUs, Xilinx's SmartNICs, and the Fungible DPU. The proliferation of these new devices has raised a number of questions regarding how best to exploit them to accelerate HPC workloads, including scientific simulations. This BoF will provide the community with an important opportunity to gather and share ideas about these promising new devices.
Long Description: The number and diversity of intelligent network devices have recently exploded. In particular, a variety of network interface cards (NICs) and data processing units (DPUs) that incorporate computational resources have recently become widely available. The emergence of these new devices is due, in part, to electrical engineering concerns: increasing network speeds require package sizes much larger than the network interface logic itself needs, leaving room for compute.
Network hardware vendors have begun releasing NICs that use this extra silicon area to build various kinds of additional compute resources (e.g., CPU cores, FPGAs, GPUs). Examples of these new devices include Nvidia's BlueField DPUs, Xilinx's SmartNICs, and the Fungible DPU. The integration of new computational resources with networking hardware presents both a unique design challenge and an opportunity to devise ways to exploit these resources to improve the performance of high-performance computing (HPC) workloads, including scientific simulations and AI/ML applications. Although this novel hardware is starting to be more widely deployed, many questions remain about how to exploit it most effectively.
This BoF will fill an important niche at SC21. In-network processing is expected to play a critical role in facilitating high performance in future large-scale systems. Given the rapid development of intelligent network hardware, this BoF will provide an opportunity for the community to share early experiences with SmartNICs and to provide feedback to SmartNIC vendors. The BoF program can benefit from this hot topic session because there is a rapidly growing SmartNIC research community. This BoF is an opportunity for application and middleware programmers to discuss how to exploit SmartNICs to improve the performance of HPC applications, and to discuss novel use cases. We also intend to invite vendors and other SmartNIC experts to contribute to the discussion.
This BoF was held for the first time at SC21. It was well attended and generated a great deal of thoughtful and engaging discussion. The expected outcome is a written report summarizing the issues and ideas discussed in the session regarding the effective use of SmartNICs to accelerate HPC applications, as well as any new use cases that are raised. We will also establish a Slack channel so that participants can continue to discuss their work with SmartNICs outside of the BoF itself.