DescriptionWith the increasing prevalence of multicore processors, shared-memory programming models are essential. OpenMP is a popular, portable, widely supported, and easy-to-use shared-memory model. Developers usually find OpenMP easy to learn. However, they are often disappointed with the performance and scalability of the resulting code. This disappointment stems not from shortcomings of OpenMP, but rather from the lack of depth with which it is employed. Our “Advanced OpenMP Programming” tutorial addresses this critical need by exploring the implications of possible OpenMP parallelization strategies, both in terms of correctness and performance.
We assume attendees understand basic parallelization concepts and know the fundamentals of OpenMP. We focus on performance aspects, such as data and thread locality on NUMA architectures, false sharing, and exploitation of vector units. All topics are accompanied by extensive case studies, and we discuss the corresponding language features in-depth. Continuing the emphasis of this successful tutorial series, we focus solely on performance programming for multi-core architectures. Throughout all topics, we present the recent additions of OpenMP 5.0, 5.1 and 5.2 and comment on developments targeting OpenMP 6.0.