A GPU Register File using Static Data Compression
Paper in proceedings, 2020

GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek new approaches to improve their utilization. This paper introduces a new register file organization for efficient register packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, at a modest output-quality degradation.
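As a rough illustration of what register packing of narrow operands means, the sketch below stores two 16-bit values in one 32-bit word. It is not the organization proposed in the paper: the 16-bit width, the use of IEEE half precision for floating-point values, and all function names are assumptions chosen for the example.

```python
import struct

MASK16 = 0xFFFF  # assumed narrow-operand width: 16 bits


def pack_ints(lo, hi):
    """Pack two unsigned 16-bit integers into one 32-bit word."""
    if not (0 <= lo <= MASK16 and 0 <= hi <= MASK16):
        raise ValueError("operand does not fit in 16 bits")
    return (hi << 16) | lo


def unpack_ints(word):
    """Recover the two 16-bit integers from a packed 32-bit word."""
    return word & MASK16, (word >> 16) & MASK16


def pack_floats(a, b):
    """Pack two floats as IEEE half-precision values in one 32-bit word.

    Rounding to half precision is lossy, which is one way a packed
    register could trade output quality for capacity.
    """
    return int.from_bytes(struct.pack("<2e", a, b), "little")


def unpack_floats(word):
    """Recover the two half-precision floats from a packed 32-bit word."""
    return struct.unpack("<2e", word.to_bytes(4, "little"))
```

Integers that fit in 16 bits round-trip exactly, while a float such as 0.1 comes back slightly changed after rounding to half precision.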

Data compaction and compression

Approximation

Graphics processors

Micro-architecture implementation considerations

Author

Alexandra Angerd

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Erik Sintorn

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Per Stenström

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

ACM International Conference Proceeding Series

3404431
978-145038816-0 (ISBN)

ICPP ’20: 49th International Conference on Parallel Processing - ICPP
Edmonton, Canada

ACE: Approximate Algorithms and Computing Systems

Swedish Research Council (VR) (2014-6221), 2015-01-01 -- 2018-12-31.

Subject Categories

Computer Engineering

Software Engineering

Computer Systems

Areas of Advance

Information and Communication Technology

DOI

10.1145/3404397.3404431

More information

Latest update

10/16/2020