ARADA: Adaptive Resource Allocation for Improving Energy Efficiency in Deep Learning Accelerators
Paper in proceedings, 2023
This paper proposes adaptive resource allocation for deep learning applications (ARADA), with the goal of improving the energy efficiency of deep learning accelerators. ARADA allocates resources on a layer-by-layer basis. The rationale is that each layer in a DL model has unique compute and memory-bandwidth requirements, so allocating fixed resources to all layers leads to inefficiencies. Adapting the allocated resources (e.g., voltage-frequency, memory bandwidth) per layer saves energy without sacrificing performance. Experimental results show that applying ARADA to the execution of 9 state-of-the-art CNN models yields energy savings of 38% on average compared to race-to-idle for an Edge TPU coupled with LPDDR4 off-chip memory.
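The intuition behind layer-by-layer allocation can be sketched in a few lines. The following is an illustrative sketch only, not the paper's actual algorithm: it classifies each layer by arithmetic intensity (MACs per byte of off-chip traffic) and picks a lower voltage-frequency point for memory-bound layers, where a fast clock mostly burns energy waiting on DRAM. The layer shapes, VF operating points, and the intensity threshold are all hypothetical values chosen for the example.

```python
# Illustrative sketch of per-layer voltage-frequency (VF) selection.
# Not the paper's algorithm; all numbers below are hypothetical.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    macs: int          # compute work (multiply-accumulates)
    bytes_moved: int   # off-chip memory traffic

# Hypothetical VF operating points: (frequency in GHz, relative power)
VF_POINTS = [(0.5, 1.0), (0.75, 1.8), (1.0, 3.0)]

def pick_vf(layer, ridge_intensity=8.0):
    """Memory-bound layers (low arithmetic intensity) gain little from a
    high clock, so run them at the lowest VF point; compute-bound layers
    keep the highest VF point to preserve performance."""
    intensity = layer.macs / layer.bytes_moved
    return VF_POINTS[0] if intensity < ridge_intensity else VF_POINTS[-1]

layers = [
    Layer("conv1", macs=118_013_952, bytes_moved=1_204_224),  # compute-bound
    Layer("fc",    macs=4_096_000,   bytes_moved=4_100_096),  # memory-bound
]
for layer in layers:
    freq, power = pick_vf(layer)
    print(f"{layer.name}: {freq} GHz (relative power {power})")
```

A real system would derive per-layer settings from profiled execution time and energy rather than a single intensity threshold, but the example shows why a uniform VF setting across all layers leaves energy on the table.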
Energy Efficiency
Resource Allocation
CNNs
Accelerators
Authors
Muhammad Waqar Azhar
Chalmers, Data- och informationsteknik, Datorteknik
Stavroula Zouzoula
Chalmers, Data- och informationsteknik, Datorteknik
Pedro Petersen Moura Trancoso
Chalmers, Data- och informationsteknik, Datorteknik
Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023
63-72
979-8-4007-0140-5 (ISBN)
Bologna, Spain
Very Efficient Deep Learning in IoT (VEDLIoT)
European Commission (EU) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.
Subject categories
Energy Systems
Computer Science
Computer Systems
DOI
10.1145/3587135.3592207