Modeling the performance of atomic primitives on modern architectures
Paper in proceedings, 2019

Utilizing the atomic primitives of a processor to access a memory location atomically is key to the correctness and feasibility of parallel software systems. The performance of atomics plays a significant role in the scalability and overall performance of parallel software systems. In this work, we study the performance (in terms of latency, throughput, fairness, and energy consumption) of atomic primitives in the context of the two common software execution settings that result in high- and low-contention access to shared memory. We perform and present an exhaustive study of the performance of atomics in these two application contexts and propose a performance model that captures their behavior. We consider two state-of-the-art architectures: Intel Xeon E5 and Xeon Phi (KNL). The proposed model is centered around the bouncing of cache lines between threads that execute atomic primitives on these shared cache lines. The model is simple enough to be used in practice, captures the behavior of atomics accurately under these execution scenarios, and facilitates algorithmic design decisions in multi-threaded programming.
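The two execution settings named in the abstract can be illustrated with a minimal C++ sketch. It is not the authors' benchmark or model. It only contrasts threads that all apply fetch_add to one shared cache line (high contention, where the line bounces between cores) with threads that each update their own padded counter (low contention). The names PaddedCounter, RunResult and run_fetch_add are hypothetical, the 64-byte line size is an assumption, and C++17 is assumed so that the over-aligned vector elements are allocated correctly.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// One counter per cache line. 64 bytes is an assumed line size,
// not a value taken from the paper.
struct alignas(64) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

struct RunResult {
    std::uint64_t total;  // sum of all counters after the run
    double seconds;       // wall-clock time, including thread start-up
};

// Each of `threads` threads performs `iters` fetch_add operations.
// shared == true  : every thread updates counters[0] (high contention).
// shared == false : thread t updates counters[t]     (low contention).
RunResult run_fetch_add(unsigned threads, std::uint64_t iters, bool shared) {
    std::vector<PaddedCounter> counters(threads);
    std::vector<std::thread> pool;

    auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&counters, t, iters, shared] {
            auto& c = counters[shared ? 0 : t].value;
            for (std::uint64_t i = 0; i < iters; ++i) {
                c.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    auto stop = std::chrono::steady_clock::now();

    // join() synchronizes with each thread, so these loads see all updates.
    std::uint64_t total = 0;
    for (auto& c : counters) {
        total += c.value.load();
    }
    return {total, std::chrono::duration<double>(stop - start).count()};
}
```

Calling run_fetch_add(n, iters, true) and run_fetch_add(n, iters, false) for the same n and comparing iters * n / seconds gives a rough throughput figure for each setting. The numbers depend on the machine, thread placement and compiler flags, and the timing includes thread creation, so this is a starting point for experiments rather than a measurement methodology.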

Modeling

Performance

Synchronization

Concurrency

Atomic Primitives

Parallel Computing

Authors

Fazeleh Sadat Hoseini

Chalmers, Computer Science and Engineering, Networks and Systems

Aras Atalar

Chalmers, Computer Science and Engineering, Networks and Systems

Philippas Tsigas

Chalmers, Computer Science and Engineering, Networks and Systems

ACM International Conference Proceeding Series

a28
978-1-4503-6295-5 (ISBN)

48th International Conference on Parallel Processing, ICPP 2019
Kyoto, Japan

Subject Categories (SSIF 2011)

Computer Engineering

Computer Science

Computer Systems

DOI

10.1145/3337821.3337901

More information

Latest update

2024-01-03