(PhD04) Multilevel Scheduling of Computations in Large-Scale Parallel Computing Systems
System Software & Runtime Systems
Time: Monday, June 17th, 1pm - 6pm
Description: Modern high performance computing (HPC) systems exhibit rapid growth in size, both “horizontally” in the number of nodes and “vertically” in the number of cores per node. As such, they offer additional levels of hardware parallelism. Each such level requires and employs algorithms for appropriately scheduling the computational work at the respective level. Understanding the relation between these levels of scheduling is important for improving the performance of scientific applications that are scheduled and executed in batches on HPC systems, and it offers several opportunities to reduce application execution times and improve resource utilization.
This Ph.D. work focuses on two scheduling levels: batch-level and application-level scheduling. Using simulations and native experimentation, we try to offer an answer to the following research question: Given massive parallelism, at multiple levels, and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained?