Non-blocking programming on multi-core graphics processors
Journal article, 2009

This paper investigates the synchronization power of coalesced memory accesses, a family of memory access mechanisms introduced in recent large multicore architectures like the CUDA graphics processors. We first design three memory access models to capture the fundamental features of the new memory access mechanisms. Subsequently, we prove the exact synchronization power of these models in terms of their consensus numbers. These tight results show that the coalesced memory access mechanisms can facilitate strong synchronization between the threads of multicore processors, without the need for synchronization primitives other than reads and writes. Moreover, based on the intrinsic features of recent GPU architectures, we construct strong synchronization objects, such as wait-free and t-resilient read-modify-write objects, for a general model of recent GPU architectures that lacks strong hardware synchronization primitives like test-and-set and compare-and-swap. Accesses to the wait-free objects have time complexity O(N), where N is the number of processes. Our result demonstrates that wait-free synchronization mechanisms can be constructed for GPUs without strong synchronization primitives in hardware, and hence that wait-free programming is possible on GPUs.

wait-free programming

GPU architectures

Authors

Phuong Ha

Philippas Tsigas

Chalmers, Computer Science and Engineering, Networks and Systems

Otto Anshus

SIGARCH Computer Architecture News

0163-5964 (ISSN)

Vol. 36, pp. 19-28

Subject Categories

Computer Engineering

Software Engineering

Computer Science