Vectorized Barrier and Reduction in LLVM OpenMP Runtime
Paper in proceedings, 2021

Barrier synchronization is a well-known operation in parallel processing that can be an obstacle to performance in parallel programs, particularly at high thread counts. Similarly, reduction is a collective communication pattern that is frequently used in parallel applications and must be optimized for applications to achieve their best performance. With the introduction of multi-core and many-core processors, several new barrier and reduction implementations have been proposed. As the number of cores per node continues to grow, implementations of these primitives need to be revisited and adapted for upcoming architectures. We see an opportunity to improve synchronization by exploiting the vector units present in modern and future CPU designs based on vector ISAs such as ARM’s Scalable Vector Extension and the RISC-V Vector extension. In this work we propose vectorized barriers and reductions using the vector-length-agnostic paradigm and implement them in the LLVM OpenMP runtime. Our barrier implementation achieves up to 2.2× and 1.4× speedup over the default LLVM OpenMP implementation on Intel KNL and Fujitsu A64FX, respectively.

Reduction

Vectorization

OpenMP

Barrier

Author

Muhammad Nufail Farooqi

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Miquel Pericas

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

03029743 (ISSN) 16113349 (eISSN)

Vol. 12870 LNCS, pp. 18-32
9783030852610 (ISBN)

17th International Workshop on OpenMP, IWOMP 2021
Bristol, United Kingdom,

The European Processor Initiative (EPI)

European Commission (EC) (EC/H2020/800928), 2018-12-01 -- 2021-11-30.

Subject Categories (SSIF 2011)

Computer Engineering

Embedded Systems

Computer Systems

DOI

10.1007/978-3-030-85262-7_2

More information

Latest update

10/4/2021