Scratchpad Memory Management for Deep Learning Accelerators
Paper in proceeding, 2024

The success of Artificial Intelligence (AI) applications is driven by efficient hardware accelerators. Recent trends show a rapid increase in application demands, which in most cases surpass the resources available in the accelerators. Efficient management of these limited resources therefore becomes a critical factor in achieving high performance. In this work we focus on the management of the available on-chip memory resources for Deep Learning (DL) accelerators. While most state-of-the-art accelerators statically partition their buffers among the different data types, we observed that the heterogeneity of recent DL models calls for more flexible solutions. We propose using all on-chip scratchpad memory, including the space for double buffering, in a unified way. To exploit that space efficiently, we propose a memory management technique that can apply different policies to best meet the demands of each execution phase. When the available memory is smaller than the requirements, the memory management can use the available space either to optimize data reuse or to fetch data ahead of use. Comparing our approach against a baseline accelerator shows that flexible management of the scratchpad memory reduces off-chip memory accesses by up to 80%, or latency by up to 56%.
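The per-phase policy choice described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Phase` fields, the `traffic_bound` hint, the policy names, and the decision rule are all hypothetical and only show how a unified scratchpad could trade data reuse against fetching ahead when space is short.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """Hypothetical description of one execution phase (sizes in bytes)."""
    name: str
    working_set: int     # data the current tile needs on chip
    next_tile: int       # space needed to fetch the next tile ahead
    reusable: int        # data that would be re-fetched if not kept on chip
    traffic_bound: bool  # assumed hint: phase is limited by off-chip traffic


def choose_policy(phase: Phase, capacity: int) -> str:
    """Pick a policy for one phase of a unified scratchpad (illustrative rule)."""
    spare = capacity - phase.working_set
    if spare < 0:
        # The current tile does not fit at all: tiling must change first.
        return "shrink-tile"
    if spare >= phase.reusable + phase.next_tile:
        # Enough room to keep reusable data and also fetch ahead.
        return "reuse+prefetch"
    # Not enough room for both: spend the spare space on one goal.
    return "reuse" if phase.traffic_bound else "prefetch"


if __name__ == "__main__":
    capacity = 1000  # assumed scratchpad size, for illustration only
    for p in (
        Phase("conv", 400, 200, 300, False),
        Phase("fc", 700, 200, 300, True),
        Phase("attention", 700, 200, 300, False),
    ):
        print(p.name, choose_policy(p, capacity))
```

In this sketch a statically partitioned design would correspond to fixing the split once for all phases, whereas the function above is re-evaluated for every phase.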

Memory Management

Deep Learning Accelerators

Scratchpad

Authors

Stavroula Zouzoula

Chalmers, Computer Science and Engineering, Computer Engineering

Mohammad Ali Maleki

Chalmers, Computer Science and Engineering, Computer Engineering

Muhammad Waqar Azhar

ZEROPOINT TECHNOLOGIES AB

Pedro Petersen Moura Trancoso

Chalmers, Computer Science and Engineering, Computer Engineering

ACM International Conference Proceeding Series

629-639
9798400708428 (ISBN)

53rd International Conference on Parallel Processing, ICPP 2024
Gotland, Sweden

Subject Categories

Computer Systems

DOI

10.1145/3673038.3673115

More information

Last updated

2024-09-09