Arm’s Scalable Vector Extension: Programming Tools and Performance Analysis
Performance Analysis and Optimization
Time: Sunday, June 16th, 9am - 1pm
Description: The Scalable Vector Extension (SVE) is the next-generation SIMD instruction set for Armv8-A. SVE does not specify a vector length, and it uses predicates to dynamically select the vector lanes on which an instruction operates. For many application developers, this presents an entirely new way of thinking about vectorization. This tutorial will introduce tools for SVE programming and performance analysis and, through hands-on exercises, will explore the unique features of SVE and demonstrate their applicability to a range of common programming motifs. The tutorial will demonstrate how to capture high-utility information and present it in meaningful ways, such as how much time is spent in application routines, where these routines are called in the source code, and how well the routines vectorize. Programmers will be introduced to the Arm C Language Extensions, which provide a set of types and accessors for SVE vectors and predicates and a function interface for all relevant SVE instructions. This tutorial will also demonstrate how the Arm Instruction Emulator may be used to execute SVE codes and gauge the quality of SVE vectorization. Attendees will complete the tutorial with a working understanding of SVE and knowledge of how SVE may be used in their applications.
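As an illustration of the vector-length-agnostic, predicated style described above, the following is a minimal sketch (not taken from the tutorial materials) of a DAXPY loop, y = a*x + y, written with the Arm C Language Extensions for SVE. The function name daxpy and its signature are assumptions made for this example. The SVE path is compiled only when the compiler defines __ARM_FEATURE_SVE; otherwise a plain scalar loop is used so the example builds on any system.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical example routine: computes y[i] = a * x[i] + y[i] for 0 <= i < n. */
void daxpy(int64_t n, double a, const double *x, double *y)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntd() returns the number of 64-bit lanes in a vector at run time,
       so the loop never hard-codes a vector length. */
    for (int64_t i = 0; i < n; i += (int64_t)svcntd()) {
        /* The predicate is true for lanes where i + lane < n, so the
           final partial vector needs no separate scalar tail loop. */
        svbool_t pg = svwhilelt_b64_s64(i, n);
        svfloat64_t vx = svld1_f64(pg, &x[i]);  /* inactive lanes are not loaded */
        svfloat64_t vy = svld1_f64(pg, &y[i]);
        vy = svmla_n_f64_x(pg, vy, vx, a);      /* vy + vx * a on active lanes */
        svst1_f64(pg, &y[i], vy);               /* inactive lanes are not stored */
    }
#else
    /* Portable scalar fallback for compilers without SVE support. */
    for (int64_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
#endif
}
```

With an SVE-capable compiler, building with an option such as -march=armv8-a+sve should enable the intrinsic path; the same binary is then expected to run unchanged on hardware or emulators with different vector lengths.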
Content Level: 30% beginner, 60% intermediate, 10% advanced
Target Audience: HPC application developers interested in optimizing for the next generation of Arm-based HPC systems.
Prerequisites: Tutorial exercises will be performed on a remote Arm-based system. Generic student accounts will be provided on the morning of the tutorial. A laptop computer with SSH and X11 (e.g. one running Linux or macOS) is required. Windows users are strongly advised to bring a Linux virtual machine or similar. A basic understanding of vectorization and HPC application development is strongly recommended. Prior experience with vector instruction sets is helpful but not required.