Throughput and energy efficiency of lock-free data structures: Execution Models and Analyses

Aras Atalar

Doctoral thesis, 2018

Concurrent data structures are key program components for harnessing the parallelism available in multi-core processors. Lock-free implementations of concurrent data structures offer high scalability and possess desirable properties such as immunity to deadlocks, convoying and priority inversion. In this thesis, we develop analytical tools to model and analyze the throughput and energy consumption of concurrent lock-free data structures. We start our study with a general class of lock-free data structures. Then, we target more specialized designs for lock-free queues. Finally, we focus on search data structures, whose characteristics differ from those of the data structures mentioned above.

Performance of lock-free data structures: This thesis contributes to bridging the gap between theoretical bounds and actual measured throughput. As the first step, we consider a general class of lock-free data structures and propose three analytical frameworks with different flavors. Analyses of this class also cover efficient implementations of a set of fundamental data structures that suffer from inherent sequential bottlenecks. We model the executions and examine the impact of contention on the throughput of these algorithms. Our analyses lead to optimization methods for memory management and back-off strategies.
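To make the notions of contention and sequential bottleneck concrete, here is a minimal toy simulation. It is not one of the thesis's analytical frameworks; it is only a sketch under simple assumptions. Each simulated thread alternates between `parallel_work` steps of independent work and a retry loop that reads a shared version, spends `retry_work` steps preparing an update, and then attempts a compare-and-swap that succeeds only if the version is unchanged. The function name, its parameters and the discrete-time scheduling are all hypothetical.

```python
import random


def simulate_retry_loops(threads, parallel_work, retry_work, steps, seed=0):
    """Toy discrete-time simulation of threads running CAS retry loops.

    Assumptions (illustrative only): every thread advances one step per
    time unit, and a CAS succeeds iff no other thread has updated the
    shared version since this thread read it.
    Returns (successful operations, failed CAS attempts).
    """
    rng = random.Random(seed)
    version = 0  # bumped by every successful CAS
    # Steps left in the current phase; start at random points of parallel work.
    remaining = [rng.randint(1, parallel_work) for _ in range(threads)]
    # Version read at the start of the current retry (None = in parallel work).
    seen = [None] * threads
    successes = failures = 0

    for _ in range(steps):
        for t in range(threads):
            remaining[t] -= 1
            if remaining[t] > 0:
                continue
            if seen[t] is None:
                # Parallel work finished: read shared state, start a retry.
                seen[t] = version
                remaining[t] = retry_work
            elif seen[t] == version:
                # CAS succeeds: publish the update, go back to parallel work.
                version += 1
                successes += 1
                seen[t] = None
                remaining[t] = parallel_work
            else:
                # CAS fails: another thread won; re-read and retry.
                failures += 1
                seen[t] = version
                remaining[t] = retry_work
    return successes, failures


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        ok, failed = simulate_retry_loops(n, parallel_work=20, retry_work=5, steps=10_000)
        print(f"threads={n} successes={ok} failed_retries={failed}")
```

In this toy model a single thread never fails a retry, while under contention successful operations are spaced at least `retry_work` steps apart, so throughput saturates no matter how many threads are added.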

Performance and energy efficiency of lock-free queues: We take a step further to model the throughput of lock-free operations and their interaction. In shared queues, a key paradigm for data sharing, the two operations (Enqueue, Dequeue) access opposite ends of the queue. Operations of the same type may contend with each other on a non-empty queue. However, all types of operations are subject to interaction when the queue is empty. We first decorrelate the throughputs of dequeuers and enqueuers into several uncorrelated basic throughputs, and then reconstruct the main throughputs as a function of these basic throughputs. In addition, we model the power dissipation and integrate it with the throughput estimates to extract the energy consumption of applications that use lock-free queues.

Performance of lock-free search data structures: Lock-free designs that utilize fine-grained synchronization have produced efficient implementations of search data structures. These designs exhibit different characteristics from the previous set of lock-free data structures with inherent sequential bottlenecks. We introduce a new way of modeling and analyzing the throughput of search data structures under stationary and memoryless access patterns.

Modeling

Data Structures

Throughput

Concurrency

Energy Efficiency

Parallel Computing

Analysis

Performance

Lock-free

Room SB-H7, Sven Hultins Gata 6, Chalmers University of Technology

Opponent: Prof. Dr. Guy Blelloch, Carnegie Mellon University, USA

Author

Aras Atalar

Chalmers, Computer Science and Engineering, Networks and Systems

As a first approximation, a computer is composed of a processor and a memory. The processor takes data from memory, processes it and writes it back to memory, and repeats this process until its task is complete. It works like a cook who takes food from the fridge, prepares it and puts it back in the fridge.

How can you make the cooking process faster? You can start by speeding up the cook. Then, you can consider using multiple cooks who work together. This obviously has the potential to speed up the process but, as the saying goes, "too many cooks spoil the broth". You need to be concerned about the interaction between the cooks if you have many of them. Are they getting along well? Are they working together efficiently? How do they synchronize on shared resources? It would not be nice if one cook attempted to melt chocolate in a pot for the dessert while another was boiling a fish soup in the same pot.

The same story applies to computers. One way to make a computer faster is to increase the speed of its processor. Unfortunately, the laws of physics bound the maximum speed of a processor, so we instead use multiple processors that work together to complete a task. Now, you might be wondering how the independent computing units of your multi-core smartphone are getting along with each other inside your pocket. Programmers are responsible for answering this question by designing reliable and efficient concurrent algorithms.

This thesis proposes analytical models. They can be used to describe complex systems and can help programmers understand, predict and optimize the performance of concurrent algorithms. Imagine a hypothetical smart kitchen that, without the cooks having to prepare the food, can answer the following questions: when will the food be ready, and how good will it be, if the cooks behave in a given way? This smart kitchen can help to find the best ways to collaborate on the steps of a complicated recipe. This thesis proposes analytical tools that serve a similar purpose in computing.

Subject categories (SSIF 2011)

Computer Engineering

Other Computer and Information Science

Computer Science

ISBN

978-91-7597-783-6

Technical report - Chalmers University of Technology, Department of Computer Engineering, Göteborg: 161D

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4464

Publisher

Chalmers