A GPU Register File using Static Data Compression
Paper in proceedings, 2020

GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to find new approaches that improve their utilization. This paper introduces a new register file organization for efficient register packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, at a modest output-quality degradation.
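As a rough illustration of what register packing of narrow operands means, the sketch below stores two 16-bit values in one 32-bit register word. It is not the paper's hardware organization: the fixed 16-bit width, the half-precision float format, and the function names (`pack_2xi16`, `unpack_2xi16`, `pack_2xf16`, `unpack_2xf16`) are assumptions chosen for the example. In the paper, operand widths are determined by static analysis.

```python
import struct

MASK16 = 0xFFFF  # assumed narrow-operand width: 16 bits


def pack_2xi16(lo: int, hi: int) -> int:
    """Pack two signed 16-bit integers into one 32-bit word."""
    for v in (lo, hi):
        if not -32768 <= v <= 32767:
            raise ValueError(f"{v} does not fit in 16 bits")
    return ((hi & MASK16) << 16) | (lo & MASK16)


def unpack_2xi16(word: int) -> tuple[int, int]:
    """Recover both signed 16-bit integers from a packed 32-bit word."""
    def sign_extend(x: int) -> int:
        return x - 0x10000 if x & 0x8000 else x
    return sign_extend(word & MASK16), sign_extend((word >> 16) & MASK16)


def pack_2xf16(a: float, b: float) -> int:
    """Pack two floats as IEEE 754 half-precision values in one 32-bit word.

    Narrowing to half precision rounds the values, which is one way a
    packed representation can trade output quality for register space.
    """
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_2xf16(word: int) -> tuple[float, float]:
    """Recover the two (rounded) half-precision values from a packed word."""
    return struct.unpack("<2e", struct.pack("<I", word))
```

Integers that fit in the narrow width round-trip exactly, while floats generally come back rounded: `unpack_2xf16(pack_2xf16(1.5, 0.1))` returns `1.5` unchanged but only an approximation of `0.1`.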

Data compaction and compression

Approximation

Graphics processors

Micro-architecture implementation considerations

Author

Alexandra Angerd

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers), Computer Systems

Erik Sintorn

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers), Real-time and Computer Graphics Systems

Per Stenström

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

ACM International Conference Proceeding Series

3404431

ICPP ’20: 49th International Conference on Parallel Processing - ICPP
Edmonton, Canada

ACE: Approximate Algorithms and Computing Systems

Swedish Research Council (VR), 2015-01-01 -- 2018-12-31.

Subject Categories

Computer Engineering

Software Engineering

Computer Systems

Areas of Advance

Information and Communication Technology

DOI

10.1145/3404397.3404431

More information

Latest update

10/16/2020