Performance Optimization of Scientific Codes with the Roofline Model
Performance Analysis and Optimization
Description: The Roofline performance model offers an insightful and intuitive way to identify performance bottlenecks and guide optimization efforts, and it has become increasingly popular in the HPC community. This tutorial will strengthen the community’s Roofline knowledge and equip attendees with a more automated and systematic methodology for Roofline-based analysis on both CPU and GPU architectures. It will start with an overview of Roofline concepts, then focus on NVIDIA GPUs and present a practical methodology for Roofline data collection. Using examples, it will discuss how characteristics such as arithmetic intensity, memory access pattern, and thread divergence can be captured by the Roofline formalism on GPUs. The tutorial will then shift its focus to Intel CPUs and proceed with a hands-on session, in which Intel Advisor and its Roofline feature are introduced and a stencil code is used to demonstrate how Roofline can guide optimization on Haswell and KNL architectures. The tutorial will conclude with a set of case studies illustrating effective use of Roofline in real-life applications. Overall, this tutorial is a unique combination of a solid methodological basis, highly practice-oriented demos and hands-on exercises, and a representative set of open-science optimization use cases.
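For readers new to the model, the core Roofline bound can be sketched in a few lines: attainable performance is the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. This is a minimal illustration rather than material from the tutorial. The function names and the machine numbers (1000 GFLOP/s, 100 GB/s) are hypothetical placeholders, not measurements of Haswell, KNL, or any NVIDIA GPU.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte).
    peak_gflops:          compute ceiling of the machine (GFLOP/s).
    peak_bandwidth_gbs:   memory bandwidth ceiling (GB/s).
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / peak_bandwidth_gbs


# Hypothetical machine, chosen only to keep the arithmetic easy to follow.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# A low-intensity kernel (assumed 0.5 FLOP/byte) sits under the bandwidth roof.
low_ai_bound = attainable_gflops(0.5, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)

# A high-intensity kernel (assumed 20 FLOP/byte) sits under the compute roof.
high_ai_bound = attainable_gflops(20.0, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)

ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
```

Kernels whose arithmetic intensity falls below the ridge point are limited by memory bandwidth, so reducing data movement helps more than adding vectorization; above the ridge point the reverse holds.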