Exploiting the Potential of Flexible Processing Units
Paper in proceeding, 2023

To meet the increased computational demands and stricter power constraints of modern applications, architectures have evolved to include domain-specific accelerators. Designing efficient accelerators requires addressing three main challenges: compute, memory, and control. Moreover, since SoCs usually contain multiple accelerators, selecting the right one for each task also becomes crucial. This is especially relevant for Flexible Processing Units (xPUs), processing units that provide multiple functionalities with the same hardware. While it is possible to use shared support components for all functionalities, doing so leads to sub-optimal performance.
In this work, we take one example of such an xPU and analyze the aspects that have not yet been fully addressed, showing that there is more potential to be exploited. By understanding the required memory patterns, we can achieve speedups of up to 72% compared to using the memory support optimized for a different functionality. Furthermore, we present an in-depth analysis of the different functionalities provided by the xPU. We then leverage the insights obtained from this analysis by providing a mechanism that selects the right functionality, maximizing hardware utilization.

Scientific Computing

Systolic Array

GEMM

Flexible Processing Unit

Vector Unit

DNN

Author

Mateo Vázquez Maceiras

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Muhammad Waqar Azhar

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Pedro Petersen Moura Trancoso

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Proceedings - Symposium on Computer Architecture and High Performance Computing

15506533 (ISSN)

34-45
979-8-3503-0549-4 (ISBN)

2023 IEEE 35th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
Porto Alegre, Brazil

Very Efficient Deep Learning in IOT (VEDLIoT)

European Commission (EC) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.

European, extendable, energy-efficient, energetic, embedded, extensible, Processor Ecosystem (eProcessor)

European Commission (EC) (EC/H2020/956702), 2021-01-01 -- 2024-06-30.

Subject Categories

Computer Engineering

Computer Science

Computer Systems

Driving Forces

Sustainable development

DOI

10.1109/SBAC-PAD59825.2023.00013

More information

Latest update

11/22/2024