Non-blocking Synchronization: Algorithms and Performance Evaluation
Doctoral thesis, 2003
The thesis investigates non-blocking synchronization in shared memory systems, in particular in high performance shared memory multiprocessors and real-time shared memory systems. We explore the performance impact of non-blocking synchronization in high performance shared memory multiprocessors and the applicability of non-blocking synchronization in real-time systems.
The performance advantage of non-blocking synchronization over mutual exclusion in shared memory multiprocessors has been advocated by the theory community for a long time. In this work, we try to make non-blocking synchronization appreciated by application designers and programmers through a sequence of results. First, we develop a non-blocking FIFO queue algorithm which is simple and can be used as a building block for applications and libraries. The algorithm is fast and scales very well in both symmetric and non-symmetric shared memory multiprocessors. Second, we implement a fine-grain parallel Quicksort using non-blocking synchronization. Although fine-grain parallelism has been thought to be inefficient for computations like sorting due to synchronization overhead, we show that efficiency can be achieved by incorporating non-blocking techniques for sharing data and computation tasks in the design and implementation of the algorithm. Finally, we investigate how performance and speedup of applications would be affected by using non-blocking rather than blocking synchronization in parallel systems. We show that for many applications, non-blocking synchronization leads to significant speedup for a fairly large number of processors, while they never slow the applications down.
Predictability is the dominant factor in performance matrices of real-time systems and a necessary requirement for non-blocking synchronization in real-time multiprocessors. In this thesis, we propose two non-blocking data structures with predictable behavior and present an inter-process coordination protocol that bounds the execution time of lock-free shared data object operations in real-time shared memory multiprocessors. The first data structure is a non-blocking buffer for real-time multiprocessors. The buffer gives a way to concurrent real-time tasks to read and write shared data and allows multiple write operations and multiple read operations to be executed concurrently and has a predictable behavior. Another data structure is a special wait-free queue for real-time systems. We present efficient algorithmic implementations for the queue. These queue implementations can be used to enable communication between real-time tasks and non-real-time tasks in systems. The inter-process protocol presented is a general protocol which gives predictable behavior to any lock-free data structure in real-time multiprocessors. The protocol works for the lock-free implementations in real-time multiprocessor systems in the same way as the multiprocessor priority ceiling protocol (MPCP) works for mutual exclusion in real-time multiprocessors. With the new protocol, the worst case execution time of accessing a lock-free shared data object can be bounded.