FiBHA: Fixed Budget Hybrid CNN Accelerator
Paper in proceedings, 2022
Previously proposed CNN accelerators fall into two categories at opposite ends of the design spectrum. Accelerators in the first category contain a minimal number of dedicated engines, so that all layers of one type (e.g. depthwise convolutions) are handled by a single engine. Accelerators in the second category have one dedicated engine per layer. The first category addresses heterogeneity between layer types, but it cannot capture heterogeneity among layers of the same type. The second category is resource-demanding. In this paper, we propose a hybrid architecture that combines design concepts from both categories, capturing more heterogeneity than the first category while being more resource-efficient than the second. To derive a hybrid accelerator for a fixed resource budget, we propose a heuristic that splits the CNN and the available resources between the components of the hybrid architecture. The proposed architecture is implemented and evaluated using high-level synthesis (HLS) on an FPGA. For a fixed hardware budget, the hybrid accelerator achieves up to 1.7x and 4.1x the throughput of state-of-the-art accelerators of the two categories.
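The idea of splitting a CNN and a fixed budget between a per-layer part and a shared-engine part can be illustrated with a toy model. The sketch below is not the heuristic from the paper, which the abstract does not detail. It assumes a hypothetical cost model in which each layer has an operation count and a utilization factor on a shared engine, the first k layers each get a dedicated engine, the remaining layers share one engine, and the budget is counted in abstract processing elements (PEs). All names (`balance_pipeline`, `hybrid_split`, `shared_eff`) are illustrative assumptions.

```python
def balance_pipeline(ops, pes):
    """Greedy allocation for dedicated engines (assumed model).

    Every layer starts with one PE; each spare PE then goes to the
    currently slowest stage, since the slowest stage sets the pace.
    """
    alloc = [1] * len(ops)
    for _ in range(pes - len(ops)):
        slowest = max(range(len(ops)), key=lambda i: ops[i] / alloc[i])
        alloc[slowest] += 1
    return alloc


def hybrid_split(layers, budget):
    """Exhaustively try split points and budget shares in the toy model.

    layers: list of (ops, shared_eff) pairs, where shared_eff in (0, 1]
            is the hypothetical utilization of that layer on a shared engine.
    Returns (interval, k, head_pes, tail_pes) with the smallest interval,
    i.e. the highest modelled throughput.
    """
    n = len(layers)
    best = None
    for k in range(n + 1):
        head = [ops for ops, _ in layers[:k]]
        tail = layers[k:]
        # Effective work of the shared part, inflated by poor utilization.
        tail_work = sum(ops / eff for ops, eff in tail)
        min_tail = 1 if tail else 0
        # Each dedicated engine needs at least one PE.
        options = [0] if k == 0 else range(k, budget - min_tail + 1)
        for head_pes in options:
            tail_pes = budget - head_pes
            alloc = balance_pipeline(head, head_pes)
            head_interval = max((o / a for o, a in zip(head, alloc)), default=0.0)
            tail_interval = tail_work / tail_pes if tail else 0.0
            # The two parts run as pipeline stages; the slower one dominates.
            interval = max(head_interval, tail_interval)
            if best is None or interval < best[0]:
                best = (interval, k, head_pes, tail_pes)
    return best


# Example: two heavy layers that map poorly to a shared engine,
# followed by two light layers that map well.
layers = [(8, 0.5), (8, 0.5), (2, 1.0), (2, 1.0)]
result = hybrid_split(layers, budget=6)
```

Because the search includes k = 0 (one shared engine for everything) and k = len(layers) (one engine per layer), the modelled result is never worse than either extreme. A real design flow would replace the toy cost model with measured or estimated latencies and FPGA resource usage.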
deep learning
hybrid accelerator
hardware/software co-design
pipelined accelerator
Convolutional neural networks (CNNs)
Authors
Fareed Mohammad Qararyah
Chalmers, Computer Science and Engineering, Computer Engineering
Muhammad Waqar Azhar
Chalmers, Computer Science and Engineering, Computer Engineering
Pedro Petersen Moura Trancoso
Chalmers, Computer Science and Engineering, Computer Engineering
Proceedings - Symposium on Computer Architecture and High Performance Computing
15506533 (ISSN)
180-190
978-1-6654-5155-0 (ISBN)
Bordeaux, France,
Very Efficient Deep Learning in IOT (VEDLIoT)
European Commission (EU) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.
Subject Categories
Computer Engineering
Computer Science
Computer Systems
DOI
10.1109/SBAC-PAD55451.2022.00029