Embedded Languages for Data-Parallel Programming
Doctoral thesis, 2013

Computers today are becoming more and more parallel. General purpose processors (CPUs) have multiple processing cores and Single Instruction Multiple Data (SIMD) units for data-parallelism. Graphics processors (GPUs) bring massive parallelism at the cost of being harder to program than CPUs. This thesis applies embedded language methodology to data-parallel programming. Two embedded languages are presented: Obsidian, for general purpose GPU programming, and EmbArBB, for data-parallel programming across platforms.

CPUs and GPUs get more parallel resources with each new generation, which raises the question of how to program these processors efficiently. We are after efficiency both in programmer productivity and in application performance. Using embedded languages allows us to experiment, at relatively little effort, with which abstractions to present to the programmer.

Obsidian is an embedded language for general purpose programming of GPUs. We try to strike a balance between high-level, productivity-increasing abstractions and the low-level control needed for performance. The Obsidian programming model mirrors the GPU architecture, and the programmer is constrained to writing GPU-friendly code. Hierarchy-level polymorphic library functions are supplied to make these constraints feel less obtrusive. Obsidian programs are compiled into CUDA C code. This compilation is based on a simple and elegant monad reification technique.

In cases where the programmer is not interested in low-level details, or wants the program to run over a range of hardware, a higher level language can be used. EmbArBB is a Haskell embedding of the Intel ArBB system. EmbArBB relies on the ArBB system to generate code (via a Just-In-Time compiler) for a range of hardware. EmbArBB embeds a preexisting library for data-parallelism into Haskell, and we obtain very good performance at little implementation effort. This performance comes from the expertise and effort that went into the ArBB system, which we get for free. Embedding ArBB is a way to provide these benefits to the Haskell programmer and a way to increase the usefulness of an existing system by opening it up to a wider audience.

Obsidian is very different; it is not based on a set of high-level parallel primitives. The Obsidian programmer can implement these primitives in different ways and then select the best one. We have obtained very good performance in case studies involving reductions. Obsidian programs are also more terse and composable than their CUDA counterparts.
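The monad reification mentioned in the abstract can be illustrated with a minimal Haskell sketch. It is a toy example and not Obsidian's actual implementation: the types Exp, Stmt and Program, the function compile, and the emitted C-like text are all hypothetical names chosen for illustration. The idea shown is that the monadic bind is kept as a data constructor, so a program written in do-notation becomes a data structure that a code generator can traverse.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical expression type for a toy embedded language (not Obsidian's).
data Exp = Lit Int | Var String | Add Exp Exp

-- Primitive statements, indexed by the type of value they return.
data Stmt a where
  Alloc  :: Exp -> Stmt Exp            -- declare a fresh variable, return it
  Assign :: String -> Exp -> Stmt ()   -- assign to a named variable

-- The monadic structure itself is reified: Return and Bind are constructors.
data Program a where
  Return :: a -> Program a
  Bind   :: Program a -> (a -> Program b) -> Program b
  Prim   :: Stmt a -> Program a

instance Functor Program where
  fmap f m = Bind m (Return . f)

instance Applicative Program where
  pure = Return
  mf <*> ma = Bind mf (\f -> Bind ma (Return . f))

instance Monad Program where
  (>>=) = Bind

showExp :: Exp -> String
showExp (Lit n)   = show n
showExp (Var v)   = v
showExp (Add a b) = "(" ++ showExp a ++ " + " ++ showExp b ++ ")"

-- Walk the reified program with a counter for fresh names, collecting one
-- line of C-like text per primitive statement.
compile :: Int -> Program a -> (a, Int, [String])
compile n (Return a) = (a, n, [])
compile n (Bind m k) =
  case compile n m of
    (a, n1, c1) ->
      case compile n1 (k a) of
        (b, n2, c2) -> (b, n2, c1 ++ c2)
compile n (Prim (Alloc e)) =
  let v = "v" ++ show n
  in (Var v, n + 1, ["int " ++ v ++ " = " ++ showExp e ++ ";"])
compile n (Prim (Assign v e)) =
  ((), n, [v ++ " = " ++ showExp e ++ ";"])

-- A small program in ordinary do-notation.
example :: Program ()
example = do
  x <- Prim (Alloc (Lit 1))
  y <- Prim (Alloc (Add x (Lit 2)))
  Prim (Assign "out" (Add x y))

-- The generated lines of C-like text for the example.
generated :: [String]
generated = case compile 0 example of
  (_, _, code) -> code

main :: IO ()
main = mapM_ putStrLn generated
```

Running main prints one generated statement per line. Because Bind stores the continuation as a function, compile has to apply it to a concrete value (here a fresh variable) before it can see the rest of the program.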

Graphics Processing Units

Data-parallelism

Functional Programming

Embedded languages

EA
Opponent: Prof. Stephen A. Edwards

Author

Joel Bo Svensson

Chalmers, Computer Science and Engineering (Chalmers), Software Technology (Chalmers)

Obsidian: A Domain Specific Embedded Language for Parallel Programming of Graphics Processors

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 5836 (2011), p. 156-173

Paper in proceeding

Counting and Occurrence Sort for GPUs using an Embedded Language

The 2nd ACM SIGPLAN Workshop on Functional High-Performance Computing, FHPC'13, Vol. 48 (2013), p. 37-45

Paper in proceeding

Programming Future Parallel Architectures with Haskell and Intel ArBB

Future Architectural Support for Parallel Programming (FASPP'11), (2011)

Paper in proceeding

Parallel Programming in Haskell Almost for Free: an embedding of Intel's Array Building Blocks

1st ACM SIGPLAN Workshop on Functional High Performance Computing, FHPC 2012. Copenhagen, 15 September 2012, (2012), p. 3-14

Paper in proceeding

Simple and Compositional Reification of Monadic Embedded Languages: Functional pearl

The 18th ACM SIGPLAN International Conference on Functional Programming, ICFP'13, (2013), p. 299-304

Paper in proceeding

Expressive array constructs in an embedded GPU kernel programming language

Proceedings of the 7th workshop on Declarative aspects and applications of multicore programming, DAMP'12, (2012), p. 21-30

Paper in proceeding

GPGPU Kernel Implementation and Refinement using Obsidian

ICCS 2010 conference proceedings, Amsterdam, Netherlands, May 31-June 2, 2010, Vol. 1 (2010), p. 2059-2068

Paper in proceeding

Areas of Advance

Information and Communication Technology

Subject Categories

Software Engineering

ISBN

978-91-7385-939-4

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie

More information

Created

10/7/2017