On Composability, Efficient Design and Memory Reclamation of Lock-free Data Structures
Doctoral thesis, 2014
The transition to multicore processors has brought synchronization, a fundamental challenge in computer science, into focus. In looking for solutions to the problem, interest has developed in the lock-free approach, which has been proven to achieve several advantages over the traditional mutual exclusion approach. This thesis studies challenges in interprocess synchronization in shared memory multicore systems using the lock-free approach. Our contributions focus on efficient design and implementation, composition, and dynamic memory reclamation of lock-free data structures, a key component in lock-free solutions to synchronization problems.
First, we show that lock-free synchronization offers several advantages. Lock-free implementations of data structures can achieve decent throughput performance while managing to provide competitive fairness among the sharing participants in accessing the shared data. We also show that although lock-freedom does not guarantee starvation-freedom, it is composable in terms of the progress guarantee. Multiple lock-free data objects can concurrently use another lock-free object without compromising their lock-free progress guarantees because operations they invoke at that object get starved.
Having shown that lock-free synchronization possesses several advantages, we then propose lock-free implementations of data structures, as they play a vital role in solving synchronization problems. We present a lock-free hash table based on cuckoo hashing scheme and a lock-free skip-list with extended functionality.
Cuckoo hashing uses two hash tables to offer two positions for any key, so hashing conflicts are solved efficiently and simply by placing conflicted keys in different positions. We develop a lock-free implementation by addressing challenges in manipulating elements in their two possible positions. The evaluation results show that our lock-free cuckoo hash table outperforms other state-of-the-art hash tables in the literature. The extended functionality for the skip-list is motivated by the parallelization of mark-split, an algorithm in the literature designed to reclaim unused memory.
Programming lock-free data structures raises a challenge in reclamation of dynamic memory; this is the subject that we study in the last part of the thesis. Reclaiming dynamically allocated memory blocks of data structures has always been a big issue, because they can be accessed, removed, or freed by any parallel processes. In lock-free programming the problem becomes even more complicated; because no process is allowed to wait for others. Automatic memory reclamation, or garbage collection, can free programmers from such a challenging task by safely reclaiming memory blocks that are no longer used. Based on the introduced skip-list, we propose a parallel design and implementation of the mark-split, a garbage collection algorithm that collects garbage using two steps: "mark" live objects and "split" free memory chunks to exclude occupied spaces from free memory. Furthermore, we address performance bottlenecks in the garbage collection when working on Non-uniform Memory Access (NUMA) multicore systems and introduce a NUMA-aware mark-compact garbage collector which is implemented in the OpenJDK's HotSpot virtual machine.
Mark-Compact
Garbage Collection
Composability
Multicore Programming
Non-blocking
Concurrent Data Structure
Synchronization
Lock-free
Mark-Split
NUMA
Parallel Garbage Collection