(PP23): MLS: Multilevel Scheduling in Large Scale High Performance Computers
Performance Analysis and Optimization
Time: Tuesday, June 18th, 3:15pm - 3:45pm
Description: High performance computing systems are growing in size (in terms of node count, core count, and core types per node), which increases the available hardware parallelism. Hardware parallelism can be found at several levels, from machine instructions to global computing sites. Unfortunately, exposing, expressing, and exploiting parallelism is difficult when parallelism increases within each level and when more than one or two parallelism levels are exploited.
The multilevel scheduling (MLS) project aims to offer an answer to the following research question: Given massive parallelism, at multiple levels, and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained?
The MLS project investigates the development of a multilevel approach for achieving scalable scheduling in large scale high performance computing systems across the multiple levels of parallelism, with a focus on software parallelism. By integrating multiple levels of parallelism, MLS differs from hierarchical scheduling, traditionally employed to achieve scalability within a single level of parallelism.
Specifically, MLS extends and bridges the most successful scheduling models (batch, application, and thread) beyond one or two parallelism levels (scaling across) and beyond their current scale (scaling out).
Via the MLS approach, the project aims to leverage all available parallelism and address hardware heterogeneity in large scale high performance computers such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained.