Concurrent Algorithms and Data Structures for Many-Core Processors
Doctoral thesis, 2011

The convergence of highly parallel many-core graphics processors with conventional multi-core processors is becoming a reality. To allow algorithms and data structures to scale efficiently on these new platforms, several important factors need to be considered. (i) The algorithmic design needs to utilize the inherent parallelism of the problem at hand. Sorting, one of the classic computing components in computer science, has a high degree of inherent parallelism. In this thesis we present the first efficient design of Quicksort for graphics processors and show that it performs well in comparison with other available sorting methods. (ii) The work needs to be distributed efficiently across the available processing units. We present an evaluation of a set of dynamic load balancing schemes for graphics processors, comparing blocking methods with non-blocking ones. (iii) The required synchronization needs to be efficient, composable and easy to use. We present a methodology for easily composing the two most common operations provided by a data structure -- the insertion and deletion of elements. By exploiting a common construction found in most non-blocking data structures, we created a move operation that can atomically move elements between different types of non-blocking data structures, without requiring a specific design for each coupling. We also present, to the best of our knowledge, the first application of software transactional memory (STM) to graphics processors. Two different STM designs, one blocking and one obstruction-free, were evaluated on the task of implementing different types of common concurrent data structures on a graphics processor.



graphics processors


software transactional memory


load balancing


EC, EDIT-huset, Chalmers
Opponent: Maged M. Michael, IBM Thomas J. Watson Research Center, New York, USA.


Daniel Cederman

Chalmers, Computer Science and Engineering, Networks and Systems

On Dynamic Load Balancing on Graphics Processors

Proceedings of the 23rd SIGGRAPH/Eurographics Conference on Graphics Hardware, Vol. 2008 (2008), p. 57-64

Paper in proceedings

Supporting Lock-Free Composition of Concurrent Data Objects

Proceedings of the 7th ACM conference on Computing frontiers (2010), p. 53-62

Paper in proceedings

GPU-Quicksort: A practical Quicksort algorithm for graphics processors

Journal of Experimental Algorithmics, Vol. 14 (2009)

Journal article

Towards a Software Transactional Memory for Graphics Processors

Proceedings of the Eurographics Symposium on Parallel Graphics and Visualization 2010 (2010)

Paper in proceedings


Information and Communication Technology


Computer Science



Technical report D - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University: 76

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 3184
