(RP27) Memory First: A Performance Tuning Strategy Focusing on Memory Access Patterns
Performance Analysis and Optimization
Scientific Software Development
Time: Tuesday, June 18th, 8:30am - 10am
Description: Because many scientific applications are memory-bound, a key to achieving high sustained performance on a modern HPC system is to fully exploit the system’s memory bandwidth. In practice, the sustained memory bandwidth of an application can be much lower than the theoretical peak bandwidth of the system, for reasons rooted in the underlying memory architecture. For example, a certain memory access pattern may cause frequent access conflicts at memory channels and/or banks, and thus lead to longer access latency. This poster therefore discusses a memory-centric performance tuning strategy, Memory First. In the conventional approach, a whole application is first written and then optimized for a particular system, which often requires major code modifications for memory-aware tuning. In the Memory First strategy, by contrast, memory access patterns capable of achieving high sustained memory bandwidth on a target system are investigated first. Then a tiny benchmark code achieving high sustained memory bandwidth is developed, keeping the target application’s behavior in mind. Finally, the code is modified to work as the target application. While application coding is well established in mature application areas such as CFD, memory-aware tuning is likely to become more painful in practice, because it has to be redone for every new architecture in a trial-and-error fashion. Therefore, giving higher priority to memory-aware tuning can result in a lower tuning cost on modern HPC systems with advanced memory technologies, such as NEC SX-Aurora TSUBASA.
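The bank-conflict effect mentioned in the description can be illustrated with a toy model. This is a minimal sketch, not the poster's benchmark: it assumes a simple word-interleaved mapping in which word address `a` lives in bank `a % num_banks`, and the bank count of 16 is a hypothetical value, not the actual configuration of NEC SX-Aurora TSUBASA or any other system. The function names are illustrative only.

```python
from math import gcd


def banks_touched(stride_words: int, num_banks: int, accesses: int = 1024) -> int:
    """Count the distinct banks hit by a strided access stream.

    Assumes a word-interleaved mapping: word address a -> bank a % num_banks.
    This mapping is an assumption for illustration, not a real system's layout.
    """
    return len({(i * stride_words) % num_banks for i in range(accesses)})


def bank_utilization(stride_words: int, num_banks: int) -> float:
    """Fraction of banks the stream can keep busy (1.0 means no structural conflict)."""
    return banks_touched(stride_words, num_banks) / num_banks


if __name__ == "__main__":
    NUM_BANKS = 16  # hypothetical bank count
    for stride in (1, 2, 3, 4, 8, 16, 17):
        used = banks_touched(stride, NUM_BANKS)
        # Under this mapping the count equals num_banks / gcd(stride, num_banks).
        assert used == NUM_BANKS // gcd(stride, NUM_BANKS)
        print(f"stride={stride:2d}  banks used={used:2d}  "
              f"utilization={bank_utilization(stride, NUM_BANKS):.2f}")
```

Under this assumed mapping, strides that share a large common factor with the bank count concentrate accesses on a few banks, while odd strides spread them over all banks. A real Memory First study would replace this model with measured bandwidth on the target system.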