Approximation and Compression Techniques to Enhance Performance of Graphics Processing Units
Doctoral thesis, 2020
This thesis provides a set of approximation and compression techniques for GPUs, with the goal of efficiently utilizing the computational fabric and thereby increasing performance. The thesis shows that these techniques can substantially lower the amount of information the system has to process, and are thus important tools for meeting challenges in memory utilization.
This thesis makes contributions within three areas: controlled floating-point precision reduction, lossless and lossy memory compression, and distributed training of neural networks. In the first area, the thesis shows that through automated and controlled floating-point approximation, the register file can be more efficiently utilized. This is achieved through a framework which establishes a cross-layer connection between the application and the microarchitecture layer, and a novel register file organization capable of leveraging low-precision floating-point values and narrow integers for increased capacity and performance.
Within the area of compression, this thesis aims at increasing the effective bandwidth of GPUs by presenting a memory compression algorithm, with lossless and lossy modes, that reduces the amount of transferred data. In contrast to state-of-the-art compression techniques such as Base-Delta-Immediate and Bitplane Compression, which use intra-block bases for compression, the proposed algorithm leverages multiple global base values to reach a higher compression ratio. The algorithm includes an optional approximation step for floating-point values, which offers a higher compression ratio at a given, low error rate.
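The idea of encoding values as small deltas from shared global bases can be illustrated with a minimal Python sketch. This is not the algorithm from the thesis: the function names, the 8-bit delta width, the nearest-base selection, and the one-flag-bit cost model are all assumptions chosen for illustration, and the bases are supplied by hand rather than derived from the data.

```python
def compress(values, bases, delta_bits=8):
    """Encode each integer as (base index, delta) if it lies close to a global base.

    Hypothetical scheme: values whose delta does not fit in `delta_bits`
    (signed) are stored raw.
    """
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    encoded = []
    for v in values:
        # Pick the global base closest to this value.
        idx = min(range(len(bases)), key=lambda i: abs(v - bases[i]))
        delta = v - bases[idx]
        if lo <= delta <= hi:
            encoded.append(("delta", idx, delta))
        else:
            encoded.append(("raw", v))
    return encoded


def decompress(encoded, bases):
    """Invert `compress`; the round trip is lossless."""
    out = []
    for item in encoded:
        if item[0] == "delta":
            out.append(bases[item[1]] + item[2])
        else:
            out.append(item[1])
    return out


def compressed_bits(encoded, num_bases, delta_bits=8, word_bits=32):
    """Assumed cost model: one flag bit per value, plus index+delta or a raw word."""
    idx_bits = max(1, (num_bases - 1).bit_length())
    total = 0
    for item in encoded:
        payload = (idx_bits + delta_bits) if item[0] == "delta" else word_bits
        total += 1 + payload
    return total


values = [1000, 1003, 5000, 4990, 70000]
bases = [1000, 5000]  # assumed global bases, shared by all blocks
encoded = compress(values, bases)
restored = decompress(encoded, bases)
size = compressed_bits(encoded, len(bases))
```

In this example, four of the five values fall within the delta range of a base and the last one is stored raw, so under the assumed cost model the data occupies 73 bits instead of 160.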
Finally, within the area of distributed training of neural networks, this thesis proposes a subgraph approximation scheme for graph data which mitigates accuracy loss in a distributed setting. The scheme allows neural network models that use graphs as inputs to converge at single-machine accuracy, while minimizing synchronization overhead between the machines.
Compression
Approximate Computing
Register File
Machine Learning
Floating-Point Precision
Microarchitecture
GPU
Author
Alexandra Angerd
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
A Framework for Automated and Controlled Floating-Point Accuracy Reduction in Graphics Applications on GPUs
Transactions on Architecture and Code Optimization, Vol. 14 (2017)
Journal article
A GPU Register File using Static Data Compression
ACM International Conference Proceeding Series (2020)
Paper in proceeding
GBDI: Going Beyond Base-Delta-Immediate Compression with Global Bases
Proceedings - International Symposium on High-Performance Computer Architecture, Vol. 2022-April (2022), p. 1115-1127
Paper in proceeding
A. Angerd, K. Balasubramanian, M. Annavaram. Distributed Training of Graph Convolutional Networks using Subgraph Approximation
One important technique towards alleviating this challenge is data compression, in which the information is encoded in a more compact format. Compression can be either lossless or lossy. When using a lossless compression technique, it is possible to reconstruct the original data without any loss of information. In contrast, lossy techniques encode the data by leaving out less important information. This is achieved through approximation of the original data.
The thesis proposes a set of approximation and compression techniques for Graphics Processing Units (GPUs), which help them to access data faster. The thesis shows that these techniques can increase the performance of GPUs.
One new insight from this thesis is that controlled approximation can increase performance while still delivering high-quality results. Controlled approximation means that the quality of the output is guaranteed to stay above a pre-defined quality threshold. This indicates that approximations can be used to increase performance in a wide range of applications.
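Controlled approximation can be illustrated with a small Python sketch that searches for the lowest floating-point precision whose output still meets a quality threshold. This is an illustrative assumption, not the framework from the thesis: the quality metric (one minus the worst relative error), the mantissa-truncation scheme, and all function names are hypothetical, and the check only covers the inputs it is given.

```python
import struct


def truncate_mantissa(x, keep_bits):
    """Keep only the top `keep_bits` of the 23 mantissa bits of x as a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def quality(reference, approx):
    """Hypothetical quality metric: 1 minus the worst-case relative error."""
    worst = max(
        (abs(r - a) / abs(r) for r, a in zip(reference, approx) if r != 0),
        default=0.0,
    )
    return 1.0 - worst


def lowest_precision(data, kernel, threshold):
    """Return the fewest mantissa bits for which the kernel output meets the threshold."""
    reference = kernel(data)
    for keep_bits in range(24):
        approx = kernel([truncate_mantissa(x, keep_bits) for x in data])
        if quality(reference, approx) >= threshold:
            return keep_bits
    return 23  # fall back to full float32 precision


def square_all(xs):
    # Stand-in for an application kernel; any list-to-list function works here.
    return [x * x for x in xs]


data = [1.5, 2.25, 3.125, 100.0625]
threshold = 0.99
bits = lowest_precision(data, square_all, threshold)
```

The search starts from the lowest precision and stops at the first setting that passes, so the returned bit count is the most aggressive approximation that still satisfies the threshold on this input.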
ACE: Approximate Algorithms and Computing Systems
Swedish Research Council (VR) (2014-6221), 2015-01-01 -- 2018-12-31.
Subject Categories
Computer Engineering
Computer Systems
Areas of Advance
Information and Communication Technology
ISBN
978-91-7905-425-0
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4892
Technical report D - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University: 192D
Publisher
Chalmers
Zoom (password request: erik.sintorn@chalmers.se)
Opponent: Prof. Natalie Enright Jerger, University of Toronto, ON, Canada