Announcing the SCC Benchmark
We are excited to announce that the SC22 Reproducibility Challenge Committee has selected the Student Cluster Competition (SCC) benchmark for this year’s Reproducibility Challenge.
The honor goes to the SC21 paper “Productivity, Portability, Performance: Data-Centric Python,” by Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, and Torsten Hoefler from ETH Zurich, Switzerland.
A team of reviewers selected the paper from 38 accepted SC21 papers that had achieved all three badges: Artifact Available, Artifact Evaluated-Functional, and Results Reproduced. The selection was based on the compatibility of the code with a range of hardware, the science, and the commitment of the team, and it followed interviews with the shortlisted authors.
The authors will work with the Reproducibility Challenge Committee to create a reproducible benchmark that builds on the paper’s results. The problem size will be adjusted considering that the teams will have a smaller-scale supercomputer as part of the competition.
At SC22, the SCC teams will be asked to run the code, replicating the findings from the original paper. The code this year is based on Python and supports a variety of hardware, including Intel, AMD, and IBM POWER CPUs; NVIDIA and AMD GPUs; Intel and Xilinx FPGAs; and ARM (A64FX).
Python is popular in the scientific community due to its portability and productivity. This paper explores its suitability as an HPC language, focusing on Python’s performance characteristics and ways in which it can be made more suitable to HPC. To that end, it puts to the test several frameworks that accelerate the execution of numerical workloads in Python on different hardware architectures: Numba, Pythran, CuPy, and DaCe – the authors’ own framework. (You can find more information in the repositories DaCe, NPBench, and the paper presentation.) The main artifact is NPBench, a collection of scientific benchmarks written in Python targeting those frameworks. This year’s reproducibility challenge will ask the students to work with Python as an HPC language and compete to achieve the best performance utilizing the Python toolchain of their choice.
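To give a flavor of what comparing Python toolchains involves, here is a minimal timing sketch. It is not taken from NPBench or the paper: the helper median_runtime and the two toy kernels are hypothetical names introduced purely for illustration, and only the Python standard library is assumed. In practice, a team would swap in a version of the same kernel built with the framework of their choice and compare it against a baseline in the same way.

```python
import math
import statistics
import time


def median_runtime(kernel, *args, repeats=5):
    """Run kernel(*args) `repeats` times; return (median seconds, last result).

    Hypothetical helper for illustration, not part of NPBench.
    """
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = kernel(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), result


def dot_loop(x, y):
    """Baseline kernel: dot product with an explicit Python loop."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


def dot_builtin(x, y):
    """Candidate kernel: same dot product using the built-in sum()."""
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    n = 100_000
    x = [float(i % 7) for i in range(n)]
    y = [float(i % 3) for i in range(n)]

    t_loop, r_loop = median_runtime(dot_loop, x, y)
    t_builtin, r_builtin = median_runtime(dot_builtin, x, y)

    # A faster variant only counts if it reproduces the baseline result.
    assert math.isclose(r_loop, r_builtin)
    print(f"loop:    {t_loop:.6f} s")
    print(f"builtin: {t_builtin:.6f} s")
```

Reporting the median over several runs, and checking that every variant returns the same answer as the baseline, reflects the two concerns at the heart of the challenge: performance and reproducibility.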
What makes the work of the student teams particularly relevant is the replication of this paper’s work across the different clusters that will be fielded by the teams. In the era of heterogeneous computing, porting applications from one platform to another is not a simple task.
The work of the student teams at SC22 is a fantastic way to dive into reproducibility challenges across various platforms and emerge with shareable, robust insights. It is the ensemble of each team’s implementation and execution of the challenge on 16 different platforms that earned this paper ACM’s “Results Reproduced” badge in the ACM Digital Library.
Sharing is at the core of the Reproducibility Challenge, so the work of the SCC teams will be collected and published. We have already published three special issues in Parallel Computing and a few in IEEE Transactions on Parallel and Distributed Systems; one of the latest ones from SC20 SCC can be found here.
Behind the Scenes: The Selection Process
We had a diverse committee staffed with members from universities, national laboratories, and organizations from six different countries. The selected paper was chosen with the help of these 12 committee members, whose expertise was invaluable in this process. We would like to extend our appreciation to each of them.
An initial round of reviews was conducted to determine suitability for the competition. Reviewers looked at whether the finalist papers had an application that could be run by the student teams on the broad range of hardware types and cluster configurations that are typically fielded by SCC teams. This initial review eliminated over 50 percent of the potential papers, for reasons such as an application that could not be executed on a variety of hardware or no obvious way to prepare a benchmark appropriate for SCC.
A second round of reviews, including at least two for each paper, looked at which application would be best suited for the SCC teams. To compile the overall score, we ranked the openness of the code (whether it is open source and available); the feasibility, considering the available hardware; and the accessibility of the science behind the paper to the student teams.
Finally, six papers were shortlisted. We interviewed the authors of each paper, and the committee used the criteria described above to select the winning paper.
The selection of the paper is only one step in a long process that ends with the preparation of the Reproducibility Challenge benchmark – one of many benchmarks that the students must execute during the competition. The details of the reproducibility benchmark assignment will be revealed at SC22. Following the conference, we will publish the students’ reports from the SC22 SCC Reproducibility Challenge, to demonstrate the effectiveness of the SCC teams and their success in replicating the code on their platforms.
Mark Your Calendar
The Student Cluster Competition will be held November 14–16 during SC22 at the Kay Bailey Hutchison Convention Center in Dallas, Texas. Visit the SCC booth on the exhibit floor and chat with students about the Reproducibility Challenge. We invite you to celebrate the student participants and the authors of the selected paper at the Awards Ceremony on Thursday of the conference. And don’t miss next year’s SCC reports!
We hope you will join us in Dallas to meet these amazing students and watch them race to reproduce this benchmark and other HPC applications.