High Performance Computing Scheduler Parameter Optimization Using Simulation and Regression
Description
Jobs on a High Performance Computing (HPC) cluster are allocated system resources by a scheduling application such as SLURM. HPC administrators can configure these schedulers extensively through parameters that modify and customize their scheduling behavior. Although the schedulers' creators and maintainers provide default values for these parameters, it is unclear which values would be optimal for a particular HPC system running the types of jobs its users typically submit. Using over 37,000 jobs from the historic job logs of Kansas State University’s High Performance Computing cluster, this research runs over 90,000 scheduler simulations on a SLURM simulator, requiring over 840,000 compute hours, and applies gradient boosted tree regression to the results to predict an optimal set of scheduler configuration parameters. The predicted configuration yields a 79% decrease in average job queue time compared with the default scheduler parameters.
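The simulate-then-regress idea described above can be illustrated with a small sketch. This is not the authors' code: the two integer parameters, the `simulated_queue_time` formula, and the grid size are all hypothetical stand-ins, and the model is a hand-written gradient boosting loop over depth-1 trees (stumps) rather than whatever library the poster used. It only shows the shape of the workflow: simulate a sample of configurations, fit a boosted regressor, then pick the configuration with the lowest predicted queue time.

```python
import itertools
import random

def simulated_queue_time(a, b):
    # Hypothetical stand-in for one expensive scheduler simulation run.
    return 10.0 + (a - 3) ** 2 + 2.0 * (b - 2) ** 2

def fit_stump(X, residuals):
    # Find the single-feature threshold split with the lowest squared error.
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]

def fit_boosted(X, y, rounds=300, lr=0.3):
    # Gradient boosting for squared loss: each stump fits the current residuals.
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + lr * (lm if row[j] <= t else rm)
                 for p, row in zip(preds, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm)
                      for j, t, lm, rm in stumps)

# Hypothetical 6 x 6 grid of two scheduler parameters; only a sample is simulated.
grid = list(itertools.product(range(6), range(6)))
random.seed(0)
sampled = random.sample(grid, 24)
model = fit_boosted(sampled, [simulated_queue_time(a, b) for a, b in sampled])

# Choose the configuration with the lowest predicted queue time over the full grid.
best = min(grid, key=lambda cfg: predict(model, cfg))
default = (0, 0)  # assumed "default" configuration for comparison
reduction = 1 - simulated_queue_time(*best) / simulated_queue_time(*default)
print(best, f"{reduction:.0%}")
```

The regressor here acts as a cheap surrogate for the simulator, which is why only part of the grid is simulated. Any reduction figure this toy prints reflects the made-up formula, not the poster's results.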
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Posters
Time
Tuesday, 15 November 2022, 8:30am - 5pm CST
Location
C1-2-3
Registration Categories
TP
XO/EX