Exploiting the Potential of Flexible Processing Units
Paper in proceeding, 2023
In this work, we take one example of such an xPU and analyze aspects that have not yet been fully addressed, showing that there is more potential to be exploited. By understanding the required memory patterns, we can achieve speedups of up to 72% compared to using memory support optimized for a different functionality. Furthermore, we present an in-depth analysis of the different functionalities provided by the xPU. We then leverage the insights from this analysis to provide a mechanism that selects the right functionality, maximizing hardware utilization.
Scientific Computing
Systolic Array
GEMM
Flexible Processing Unit
Vector Unit
DNN
Author
Mateo Vázquez Maceiras
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Muhammad Waqar Azhar
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Pedro Petersen Moura Trancoso
Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)
Proceedings - Symposium on Computer Architecture and High Performance Computing
15506533 (ISSN)
34-45
979-8-3503-0549-4 (ISBN)
Porto Alegre, Brazil
Very Efficient Deep Learning in IOT (VEDLIoT)
European Commission (EC) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.
European, extendable, energy-efficient, energetic, embedded, extensible, Processor Ecosystem (eProcessor)
European Commission (EC) (EC/H2020/956702), 2021-01-01 -- 2024-06-30.
Subject Categories
Computer Engineering
Computer Science
Computer Systems
Driving Forces
Sustainable development
DOI
10.1109/SBAC-PAD59825.2023.00013