Efficient concurrent data structure access parallelism techniques for increasing scalability
Doctoral thesis, 2023
In the first part of the thesis, we focus on data structure semantic relaxation. Relaxing the semantics of a data structure unveils a larger design space that allows weaker synchronization and more useful parallelism. A major challenge in the area is to investigate new data structure designs that can trade semantics for better performance in a monotonic way. We address this challenge algorithmically in this part of the thesis. We present an efficient, lock-free, concurrent data structure design framework for out-of-order semantic relaxation. We introduce a new two-dimensional algorithmic design that uses multiple instances of a given data structure to improve access parallelism.
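To illustrate the multiple-instance idea (this is a minimal hypothetical sketch, not the thesis's 2D framework): several independent Treiber sub-stacks share the load, and each operation works on a randomly chosen one. A pop may then return an element that is not the globally most recent push — the out-of-order relaxation traded for having several independent access points.

```c
#include <stdatomic.h>
#include <stdlib.h>

#define K 4  /* number of sub-stacks; more instances = more relaxation, less contention */

typedef struct node { int val; struct node *next; } node_t;

static _Atomic(node_t *) tops[K];  /* one lock-free top pointer per sub-stack */

void relaxed_push(int val) {
    node_t *n = malloc(sizeof *n);
    n->val = val;
    int i = rand() % K;                 /* pick a random sub-stack */
    n->next = atomic_load(&tops[i]);
    /* retry the CAS until our node is spliced in atop sub-stack i;
     * on failure, n->next is refreshed to the current top */
    while (!atomic_compare_exchange_weak(&tops[i], &n->next, n))
        ;
}

/* Pop from a random sub-stack, falling back to scanning the others;
 * returns 1 and writes *out on success, 0 if every sub-stack is empty. */
int relaxed_pop(int *out) {
    int start = rand() % K;
    for (int j = 0; j < K; j++) {
        int i = (start + j) % K;
        node_t *top = atomic_load(&tops[i]);
        while (top) {
            if (atomic_compare_exchange_weak(&tops[i], &top, top->next)) {
                *out = top->val;
                free(top);
                return 1;
            }
            /* CAS failure reloaded `top`; retry on this sub-stack */
        }
    }
    return 0;
}
```

Note that threads contending on K access points instead of one is exactly what buys the extra parallelism, at the cost of relaxed LIFO order across sub-stacks.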
In the second part of the thesis, we propose an efficient priority queue that improves access parallelism by reducing the number of synchronization points per operation. Priority queues are fundamental abstract data types, often used to manage limited resources in parallel systems. Typical parallel priority queue implementations are based on heaps or skip lists, and recent literature has shown skip lists to be the most efficient design choice. Although numerous intricate skip-list-based queues have been proposed, their performance is constrained by a high number of global atomic updates per operation and by high memory consumption, both proportional to the number of sub-lists in the queue. In this part of the thesis, we propose an alternative approach for designing lock-free linearizable priority queues that significantly improves memory efficiency and throughput by reducing the number of global atomic updates and the memory consumption compared to skip-list-based queues. To achieve this, our new design combines two structures, a search tree and a linked list, forming what we call a Tree Search List Queue (TSLQueue).
Subsequently, we analyse and introduce a model for lock-free concurrent data structure access parallelism. The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access points, which leads to thread serialisation and hinders parallelism. Aiming to address this challenge, a significant amount of work in the literature has proposed multi-access techniques that improve concurrent data structure parallelism. However, there is little work on analysing and modelling the execution behaviour of concurrent multi-access data structures, especially in a shared memory setting. In this part of the thesis, we analyse and model the general execution behaviour of concurrent multi-access data structures in the shared memory setting. We study and analyse the behaviour of the two popular random access patterns, shared (remote) and exclusive (local) access, and the behaviour of the two atomic primitives most commonly used to design lock-free data structures: Compare-and-Swap and Fetch-and-Add.
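The behavioural difference between the two primitives can be sketched with a shared counter (a minimal illustration, not the thesis's model): Fetch-and-Add always completes in a single atomic step, whereas Compare-and-Swap succeeds only if the location still holds the expected value, so under contention it must be retried in a loop.

```c
#include <stdatomic.h>

atomic_int counter = 0;

/* FAA-based increment: one atomic instruction, never retries. */
int incr_faa(void) {
    return atomic_fetch_add(&counter, 1);  /* returns the previous value */
}

/* CAS-based increment: read, compute, attempt to swap; retry on failure. */
int incr_cas(void) {
    int old = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
        ;  /* on failure, `old` was refreshed to the current value; try again */
    return old;
}
```

Under heavy contention the CAS loop wastes work on failed attempts, which is one reason the access pattern and choice of primitive matter for throughput.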
FIFO queue
parallelism
performance modelling
design framework
semantic relaxation
Data structure
multi-access
lock free
performance analysis
counter
concurrency
search tree
stack
multi-core processor
priority queue
Author
Adones Rukundo
Network and Systems
Monotonically relaxing concurrent data-structure semantics for increasing performance: An efficient 2D design framework
Leibniz International Proceedings in Informatics (LIPIcs), Vol. 146 (2019)
Paper in proceeding
TSLQueue: An Efficient Lock-Free Design for Priority Queues
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 12820 LNCS (2021), p. 385-401
Paper in proceeding
Performance Analysis and Modelling of Concurrent Multi-access Data Structures
Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '22 (2022), p. 333-344
Paper in proceeding
Consider a job of loading a given number of boxes into a single truck at a warehouse using multiple workers. The challenge here is how to count the number of loaded boxes so that the given number is not exceeded. A basic way to solve this is to have a single list on which the workers tally the number of boxes they have loaded. Each worker has to check the tally for the current number of boxes loaded before they can load a box onto the truck. If the tally is less than the required number, the worker increases the tally by one and loads another box onto the truck. Otherwise, the workers stop loading and the truck sets off. With a single tally list, the loading process will be slow, since the list can only be accessed by one worker at a time. This means that increasing the number of workers might not increase (scale) the loading speed, since they have to queue up (wait) to access the tally list.
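The single tally list corresponds to one shared bounded counter (a sketch with hypothetical names): every worker contends on the same memory location, which is exactly the serialisation bottleneck described above.

```c
#include <stdatomic.h>

#define MAX_BOXES 100   /* the given number of boxes to load */

static atomic_int boxes = 0;   /* the single shared tally list */

/* Check the tally and claim a box only if the limit is not yet reached;
 * returns 1 if a box may be loaded, 0 once the truck is full. */
int load_one_box(void) {
    int cur = atomic_load(&boxes);
    while (cur < MAX_BOXES) {
        /* CAS: only one worker wins each slot; losers reread `cur` and retry */
        if (atomic_compare_exchange_weak(&boxes, &cur, cur + 1))
            return 1;
    }
    return 0;
}
```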
To avoid queuing on a shared list, each worker can be assigned a specific number of boxes to load. This way, each worker tracks their own count without sharing a list. The loading process is complete once all the workers have loaded their specific number of boxes.
In this case, the loading process duration will be determined by the slowest worker irrespective of how many fast workers are involved.
A more efficient way to solve this problem is to have multiple lists, each with a maximum tally threshold. Each worker can select a random list on which to tally before loading a box, for as long as that list is below the given tally threshold. Once all the lists have reached the maximum tally threshold, the loading process is complete and the truck can set off. Here, the loading process does not have to be delayed by slow workers, since there is no limit on individual workers: faster workers can simply load more boxes than slower ones. At the same time, workers do not have to queue up on a single shared list. As in concurrent computing, the primary goal here is to efficiently harness the workforce of multiple workers without losing count of the boxes being loaded.
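The multiple-lists idea can be sketched as a sharded bounded counter (hypothetical names, single-machine sketch): each "tally list" is its own atomic counter with its own threshold, and a worker claims a box by incrementing a randomly chosen shard, so workers rarely collide on the same location.

```c
#include <stdatomic.h>
#include <stdlib.h>

#define LISTS 4
#define PER_LIST_MAX 25   /* threshold per tally list; 4 * 25 = 100 boxes total */

static atomic_int tally[LISTS];

/* Try to claim one box: returns 1 on success, 0 once every list is full. */
int claim_box(void) {
    int start = rand() % LISTS;                 /* pick a random tally list */
    for (int j = 0; j < LISTS; j++) {
        int i = (start + j) % LISTS;
        int t = atomic_fetch_add(&tally[i], 1); /* claim a slot on list i */
        if (t < PER_LIST_MAX)
            return 1;                           /* slot is valid: load the box */
        atomic_fetch_sub(&tally[i], 1);         /* overshot: undo and try the next list */
    }
    return 0;                                   /* every list is at its threshold */
}
```

Because each claim is a single Fetch-and-Add on one of several locations, fast workers are never throttled by slow ones and the total never exceeds the bound — the counting analogue of the multi-access designs studied in the thesis.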
Future factories in the Cloud (FiC)
Swedish Foundation for Strategic Research (SSF) (GMT14-0032), 2016-01-01 -- 2020-12-31.
Sweden-East Africa University Network knowledge development for sustainable development
The Swedish Foundation for International Cooperation in Research and Higher Education (STINT) (SG2021-8934), 2022-01-08 -- 2024-12-31.
Areas of Advance
Information and Communication Technology
Building Futures (2010-2018)
Driving Forces
Sustainable development
Innovation and entrepreneurship
Subject Categories
Computer and Information Science
Computer Science
Roots
Basic sciences
ISBN
978-91-7905-837-1
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5303
Publisher
Chalmers
Room HA2, Johanneberg campus, Chalmers (https://maps.chalmers.se/#0bb94e4a-61cf-45e6-a197-260258e605ce)
Opponent: Prof. Peter Sanders, Karlsruhe Institute of Technology, Germany