DNNOPT: A Framework for Efficiently Selecting On-chip Memory Loop Optimizations of DNN Accelerators
Paper in proceedings, 2024

Deep neural network (DNN) accelerators often utilize on-chip memory poorly, which can reduce performance and energy efficiency. Loop reordering and blocking are used to improve on-chip memory utilization in DNN accelerators. However, existing optimization frameworks are inefficient, either because searching the entire search space has prohibitive time complexity or because they select sub-optimal optimizations. This paper proposes DNNOPT, a hardware/software framework for optimally selecting loop orders and blocking factors, with loop reordering and blocking applied in isolation or in combination. DNNOPT uses the proposed Early exit and Strided search strategies to prune the search space, and simple analytical models of data reuse to evaluate each optimization point efficiently and accurately. Overall, DNNOPT reduces the search space by more than two orders of magnitude and, for convolutional neural network (CNN) and Transformer applications, improves performance, energy efficiency, and time to solution by, on average, 1.8×, 50%, and 226×, respectively, compared to state-of-the-art frameworks.
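To make the idea of selecting blocking factors concrete, the following is a toy Python sketch and not the DNNOPT implementation. It assumes a blocked matrix multiplication C[M,N] += A[M,K] * B[K,N] and a deliberately simple traffic model in which one tile of each matrix is loaded per tile iteration. The functions `offchip_traffic`, `buffer_words`, and `search` are hypothetical names. The `stride` parameter (sampling every `stride`-th candidate factor) and the capacity-based `break` are only loose illustrations of search-space pruning; the paper's Strided search and Early exit strategies may be defined differently.

```python
def divisors(n):
    """Candidate blocking factors: all divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def buffer_words(tm, tn, tk):
    """On-chip words needed to hold one tile each of A, B and C."""
    return tm * tk + tk * tn + tm * tn


def offchip_traffic(M, N, K, tm, tn, tk):
    """Toy analytical model (an assumption): every tile iteration loads
    one tile of A, B and C from off-chip memory."""
    tile_iterations = (M // tm) * (N // tn) * (K // tk)
    return tile_iterations * buffer_words(tm, tn, tk)


def search(M, N, K, capacity, stride=1):
    """Return ((cost, (tm, tn, tk)), evaluated_points) for the cheapest
    blocking that fits in `capacity` words of on-chip memory."""
    best = None
    evaluated = 0
    for tm in divisors(M)[::stride]:
        for tn in divisors(N)[::stride]:
            for tk in divisors(K)[::stride]:
                if buffer_words(tm, tn, tk) > capacity:
                    # Candidates are ascending, so any larger tk
                    # would also overflow the buffer: stop early.
                    break
                evaluated += 1
                cost = offchip_traffic(M, N, K, tm, tn, tk)
                if best is None or cost < best[0]:
                    best = (cost, (tm, tn, tk))
    return best, evaluated


full_best, full_points = search(8, 8, 8, capacity=48)
strided_best, strided_points = search(8, 8, 8, capacity=48, stride=2)
print(full_best, full_points)
print(strided_best, strided_points)
```

In this small instance the strided run evaluates fewer points than the full run and still finds the same blocking, but that is a property of the example rather than a general guarantee: coarser sampling can miss the optimum.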

On-chip Memory Management

Loop Re-Order

Energy Efficient DNN Acceleration

Reuse Distance

Loop Blocking

DNN acceleration

Proceedings of the 21st ACM International Conference on Computing Frontiers, CF 2024

126-137
9798400705977 (ISBN)

21st ACM International Conference on Computing Frontiers, CF 2024
Ischia, Italy

Subject categories (SSIF 2011)

Computer Systems

DOI

10.1145/3649153.3649196

More information

Last updated

2024-08-07