HPC in Asia:
(RP16) Power Prediction with Probabilistic Topic Modeling for HPC
AI/Machine Learning/Deep Learning
Big Data Analytics
Time: Wednesday, June 19th, 10:10am - 11am
Description: Hundreds of megawatts of power will be required for exascale supercomputers in 2023, so power consumption becomes a critical factor for next-generation systems. Power-aware scheduling with job power prediction is a key technology for achieving energy-efficient operation and high system utilization. Recently, a significant amount of research has used machine learning to predict job power from job entries such as user ID and number of nodes. One obstacle to putting these approaches into practice is the laborious tuning of the weight given to each job entry, because the appropriate weights differ from site to site. In this work, we develop a novel two-step power prediction model that combines a topic model with a probabilistic model. The model predicts the power of each job from its submitted job entries without manual tuning of the weights. First, all the job entries of a target job are fed to the trained topic model to derive 10 candidate jobs from the past job database. Then, the probabilistic model selects, from the 10 candidates, the one job with the highest probability of success and uses its power as the prediction for the target job. The probabilistic model learns automatically how to weight the job entries, based on the relationship between past entries and past power prediction results.
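The two-step flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the topic model is replaced here by a plain cosine similarity over job-entry tokens, and the probabilistic model by smoothed per-entry success frequencies. The names `PAST_JOBS`, `FIELDS`, `candidates`, `train_weights`, `success_probability`, and `predict_power`, the sample job records, and the 20% success tolerance are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical past-job database: job entries plus measured power (watts).
PAST_JOBS = [
    {"user": "alice", "nodes": "8", "power": 1000.0},
    {"user": "alice", "nodes": "16", "power": 1050.0},
    {"user": "bob", "nodes": "8", "power": 3000.0},
    {"user": "bob", "nodes": "16", "power": 3100.0},
]
FIELDS = ["user", "nodes"]  # job entries used as features (assumed)


def entry_tokens(job, fields):
    """Turn job entries into a bag of 'field=value' tokens."""
    return Counter(f"{f}={job[f]}" for f in fields)


def cosine(a, b):
    """Cosine similarity between two token bags."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidates(target, past, fields, k=10):
    """Step 1 (stand-in for the topic model): the k most similar past jobs."""
    tt = entry_tokens(target, fields)
    ranked = sorted(past, key=lambda j: cosine(tt, entry_tokens(j, fields)), reverse=True)
    return ranked[:k]


def train_weights(history, fields, tol=0.2):
    """Estimate P(success | entry matches or not) from past (target, candidate) pairs.

    A pair counts as a success when the candidate's power is within `tol`
    relative error of the target's power. Counts start at 1 success in
    2 trials (Laplace smoothing), so no probability is exactly 0 or 1.
    """
    counts = {(f, m): [1, 2] for f in fields for m in (True, False)}
    for target, cand in history:
        ok = abs(cand["power"] - target["power"]) / target["power"] <= tol
        for f in fields:
            c = counts[(f, target[f] == cand[f])]
            c[0] += ok
            c[1] += 1
    return {key: s / n for key, (s, n) in counts.items()}


def success_probability(target, cand, weights, fields):
    """Step 2 score: product of the learned per-entry success probabilities."""
    p = 1.0
    for f in fields:
        p *= weights[(f, target[f] == cand[f])]
    return p


def predict_power(target, past, weights, fields, k=10):
    """Pick the candidate with the highest success probability and reuse its power."""
    cands = candidates(target, past, fields, k)
    best = max(cands, key=lambda c: success_probability(target, c, weights, fields))
    return best["power"]


# Train on all ordered pairs of distinct past jobs (an assumption for this toy data).
HISTORY = [(t, c) for t in PAST_JOBS for c in PAST_JOBS if t is not c]
WEIGHTS = train_weights(HISTORY, FIELDS)

if __name__ == "__main__":
    new_job = {"user": "bob", "nodes": "32"}
    print(predict_power(new_job, PAST_JOBS, WEIGHTS, FIELDS))
```

In this toy data the learned weights favor a matching user over a matching node count, which is the kind of site-specific weighting the abstract says is learned rather than tuned by hand.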
We demonstrated power prediction on the K computer over a 3-month period. An average relative error of 18% was achieved for total job power prediction. The proposed two-step scheme improves accuracy by 3.1% compared with a one-step scheme that uses the topic model only.
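The abstract does not define its error metric precisely. One common reading of "average relative error" is the mean of |predicted - measured| / measured over all jobs, sketched below. The function name `mean_relative_error` and the sample values are hypothetical and are not taken from the K computer data.

```python
def mean_relative_error(predicted, measured):
    """Mean of |predicted - measured| / measured over paired job powers."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


if __name__ == "__main__":
    # Hypothetical job powers in watts.
    print(mean_relative_error([110.0, 90.0], [100.0, 100.0]))
```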