SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

ACM Student Research Competition Poster Archive

High Performance Computing Scheduler Parameter Optimization Using Simulation and Regression

Student: Scott Hutchison (Kansas State University)
Supervisor: Daniel Andresen (Kansas State University)

Abstract: Jobs on a High Performance Computing (HPC) cluster are allocated system resources by a scheduling application such as SLURM. HPC administrators can configure these schedulers extensively through parameters that modify and customize their scheduling behavior. Although the schedulers' creators and maintainers provide default values for these parameters, it is unclear which values would be optimal for a particular HPC system running the types of jobs its users typically submit. Using over 37,000 jobs from the historical job logs of Kansas State University’s High Performance Computing cluster, this research uses a SLURM simulator to execute over 90,000 scheduler simulations requiring over 840,000 compute hours. Combined with gradient boosted tree regression, these simulations are used to predict an optimal set of scheduler configuration parameters, which results in a 79% decrease in average job queue time compared with the default scheduler parameters.
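The workflow in the abstract (simulate many parameter settings, fit a regression to the results, then ask the regression which setting it expects to give the shortest queue time) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `toy_avg_queue_time` is a made-up stand-in for a simulator run, the two parameters `a` and `b` are hypothetical rather than real SLURM settings, and the regressor is a deliberately simplified gradient boosting loop over depth-1 trees (stumps) written in plain Python. The abstract does not state which library, tree depth, or hyperparameters the authors used.

```python
# Hypothetical stand-in for one scheduler simulation: maps two made-up
# parameter values to an average queue time in hours. Not a real SLURM model.
def toy_avg_queue_time(a, b):
    return (a - 3.0) ** 2 + 0.5 * (b - 7.0) ** 2 + 1.0


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between adjacent distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_boosted(X, y, rounds=200, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


# "Simulated" training data: a coarse grid of hypothetical parameter settings.
train_X = [(a, b) for a in range(0, 11, 2) for b in range(0, 11, 2)]
train_y = [toy_avg_queue_time(a, b) for a, b in train_X]
model = fit_boosted(train_X, train_y)

# Ask the fitted model which candidate setting it expects to minimize queue time.
candidates = [(a, b) for a in range(11) for b in range(11)]
best = min(candidates, key=lambda row: predict(model, row))
default = (0, 0)  # hypothetical "default" setting used only for comparison

print("predicted best setting:", best)
print("predicted queue time at best:", round(predict(model, best), 2))
print("predicted queue time at default:", round(predict(model, default), 2))
```

In a real study, each training row would come from a full simulator run over the historical job log, and a predicted optimum would normally be confirmed by running the simulator again with that setting rather than trusting the regression alone.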

ACM-SRC Semi-Finalist: no

Poster: PDF
Poster Summary: PDF
