ARADA: Adaptive Resource Allocation for Improving Energy Efficiency in Deep Learning Accelerators
Paper in proceeding, 2023
This paper proposes ARADA, an adaptive resource allocation scheme for deep learning (DL) applications that aims to improve the energy efficiency of DL accelerators. ARADA allocates resources layer by layer. The rationale is that each layer in a DL model has its own compute and memory-bandwidth requirements, so allocating fixed resources to all layers leads to inefficiencies. Adjusting the allocated resources per layer (e.g., voltage-frequency, memory bandwidth) saves energy without sacrificing performance. Experimental results show that applying ARADA to the execution of 9 state-of-the-art CNN models yields energy savings of 38% on average compared to race-to-idle on an Edge TPU coupled with LPDDR4 off-chip memory.
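The layer-by-layer idea can be illustrated with a minimal sketch. It is not the paper's algorithm or its models: it assumes a simple roofline-style latency model, a dynamic power model of the form P = C·V²·f, memory power proportional to the configured bandwidth, and invented operating points. All names and constants (`Layer`, `VF_POINTS`, `BW_POINTS`, `OPS_PER_CYCLE`, `C_EFF`, `MEM_W_PER_BPS`, `allocate`) are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Layer:
    name: str
    ops: float         # compute work of the layer (operations)
    mem_bytes: float   # off-chip traffic of the layer (bytes)


# Hypothetical operating points; not taken from the paper or any datasheet.
VF_POINTS = [(0.6, 0.25e9), (0.8, 0.5e9), (1.0, 1.0e9)]  # (volts, Hz)
BW_POINTS = [4e9, 8e9, 16e9]                              # bytes per second
OPS_PER_CYCLE = 1024     # assumed accelerator throughput per cycle
C_EFF = 1e-9             # assumed effective capacitance for P = C * V^2 * f
MEM_W_PER_BPS = 5e-11    # assumed memory power per byte/s of bandwidth


def latency(layer, vf, bw):
    """Roofline-style latency: the slower of compute and memory."""
    _, freq = vf
    return max(layer.ops / (OPS_PER_CYCLE * freq), layer.mem_bytes / bw)


def energy(layer, vf, bw):
    """Energy = (compute power + memory power) * latency."""
    volts, freq = vf
    power = C_EFF * volts ** 2 * freq + MEM_W_PER_BPS * bw
    return power * latency(layer, vf, bw)


def allocate(layer):
    """Pick the lowest-energy setting that is no slower than the
    maximum-resource setting, so performance is not sacrificed."""
    max_cfg = (VF_POINTS[-1], BW_POINTS[-1])
    deadline = latency(layer, *max_cfg)
    feasible = [
        cfg for cfg in product(VF_POINTS, BW_POINTS)
        if latency(layer, *cfg) <= deadline
    ]
    return min(feasible, key=lambda cfg: energy(layer, *cfg))


if __name__ == "__main__":
    layers = [
        Layer("memory_bound", ops=1e6, mem_bytes=16e6),
        Layer("compute_bound", ops=1e9, mem_bytes=1e6),
    ]
    for layer in layers:
        vf, bw = allocate(layer)
        print(layer.name, "V/f:", vf, "bandwidth:", bw)
```

Under these assumptions, the memory-bound layer keeps full bandwidth but drops to the lowest voltage-frequency point, while the compute-bound layer keeps the highest voltage-frequency point and drops to the lowest bandwidth. In both cases latency matches the maximum-resource setting.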
Energy Efficiency
Resource Allocation
CNNs
Accelerators
Author
Muhammad Waqar Azhar
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Stavroula Zouzoula
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Pedro Petersen Moura Trancoso
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023
63-72
979-8-4007-0140-5 (ISBN)
Bologna, Italy
Very Efficient Deep Learning in IoT (VEDLIoT)
European Commission (EC) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.
Subject Categories (SSIF 2011)
Energy Systems
Computer Science
Computer Systems
DOI
10.1145/3587135.3592207