Learning Neural Representations for Predicting GPU Performance
Event Type
Research Paper
AI/Machine Learning/Deep Learning
HPC Accelerators
Performance Analysis and Optimization
Time
Tuesday, June 18th, 9:30am - 10am CEST
Location
Analog 1, 2
Description
Graphics processing units (GPUs) have become a primary source of heterogeneity in today's computing systems. With the rapid increase in the number and types of available GPUs, finding the best hardware accelerator for each application is a challenge. Moreover, executing every application on every GPU system to learn the correlation between application properties and hardware characteristics is time-consuming and tedious.
To address this problem, we extend our previously proposed collaborative-filtering-based modeling technique to build an analytical model that can predict the performance of applications across different GPU systems.
Our model learns representations, or embeddings (dense vectors of latent features), for applications and systems, and uses them to characterize the performance of various GPU-accelerated applications.
We improve on the state-of-the-art collaborative filtering approach based on matrix factorization by building a multi-layer perceptron. In addition to increased accuracy in predicting application performance, this model can simultaneously predict multiple metrics, such as the rates of memory access operations.
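To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how an MLP over concatenated application and system embeddings can replace the dot product used in classic matrix factorization, and why this naturally supports predicting several metrics at once. All sizes, weights, and function names here are illustrative assumptions; in the actual work the embeddings and MLP weights are learned from measured performance data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 30 applications, 7 GPUs, 8 latent features each
# (the application/GPU counts mirror the evaluation setup; DIM is assumed).
N_APPS, N_GPUS, DIM = 30, 7, 8

# Learned embeddings; random placeholders stand in for trained values.
app_emb = rng.normal(size=(N_APPS, DIM))
gpu_emb = rng.normal(size=(N_GPUS, DIM))

def mf_predict(a, g):
    """Classic matrix-factorization prediction: dot product of the two
    embeddings yields a single scalar performance estimate."""
    return float(app_emb[a] @ gpu_emb[g])

# A small MLP replaces the dot product: it consumes the concatenated
# embeddings and emits one output per target metric (e.g. instructions
# per second and a memory-access rate), enabling multi-metric prediction.
N_METRICS = 2
W1 = rng.normal(size=(2 * DIM, 16)) * 0.1
b1 = np.zeros(16)
W2 = rng.normal(size=(16, N_METRICS)) * 0.1
b2 = np.zeros(N_METRICS)

def mlp_predict(a, g):
    """MLP prediction over concatenated embeddings (one hidden layer)."""
    x = np.concatenate([app_emb[a], gpu_emb[g]])
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2                # one output per metric

y = mlp_predict(3, 2)
print(y.shape)  # one prediction per metric
```

The key design difference: the dot product constrains interactions between application and system features to be linear in the latent space, whereas the MLP can learn nonlinear interactions and share its hidden representation across several output metrics.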
We evaluate our approach on a set of 30 well-known micro-applications and seven Nvidia GPUs. As a result, we can predict the expected instructions-per-second value with 90.6% accuracy on average.