Challenges and Opportunities in the Co-design of Convolutions and RISC-V Vector Processors
Paper in proceedings, 2023

The RISC-V "V" extension introduces vector processing to the RISC-V architecture. Unlike most SIMD extensions, it supports long vectors, which can yield significant performance improvements for many applications. In this paper, we present our ongoing research to implement and optimize a vectorized Winograd algorithm used in convolutional layers on RISC-V Vector (RISC-VV) processors. Our study identifies effective techniques for optimizing the Winograd kernels on RISC-VV using the available intrinsic instructions, and shows that certain instructions offer better performance for the vectorized algorithm. Our co-design findings suggest that the Winograd algorithm benefits from vector lengths up to 2048 bits and cache sizes up to 64 MB. We use our experience with Winograd to highlight potential enhancements to the standard that would simplify code generation and aid low-level programming. Finally, we share our experience from experimenting with forks of gem5 for RISC-VV and stress the importance of a mature software ecosystem to facilitate design space exploration and architectural optimization.
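For readers unfamiliar with the algorithm named in the abstract, the following is a minimal scalar sketch of the textbook one-dimensional Winograd minimal filtering form F(2,3), which computes two outputs of a 3-tap filter with four multiplications instead of six. It is an illustration only: the abstract does not state which tile size or kernel structure the paper uses, and the function names `direct_f23` and `winograd_f23` are hypothetical. A vectorized implementation would apply the same transforms across many tiles at once.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter over four inputs (6 multiplies)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return (y0, y1)


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): the same two outputs with 4 multiplies.

    Illustrative sketch only; not the kernel described in the paper.
    """
    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions and subtractions only).
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only four multiplications.
    m0 = u0 * v0
    m1 = u1 * v1
    m2 = u2 * v2
    m3 = u3 * v3

    # Output transform.
    return (m0 + m1 + m2, m1 - m2 - m3)


d = [1.0, 2.0, 3.0, 4.0]   # four input samples
g = [1.0, 2.0, 3.0]        # three filter taps
result = winograd_f23(d, g)
print(result)              # (14.0, 20.0)
```

Both functions return the same pair for the sample data above; the saving comes from replacing two of the six multiplications with cheaper additions in the transforms.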

RISC-V Vector extension

Winograd

Performance

ARM-SVE

Optimization

Authors

Sonia Rani Gupta

Chalmers, Computer Science and Engineering, Computer Engineering

Nikela Papadopoulou

Chalmers, Computer Science and Engineering, Computer Engineering

Miquel Pericas

Chalmers, Computer Science and Engineering, Computer Engineering

ACM International Conference Proceeding Series

Vol. 2023, pp. 1550-1556
9798400707858 (ISBN)

2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Denver, USA

P4PIM: Principles for power-constrained HPC programming for PIM networks

Swedish Research Council (VR) (2020-04892), 2021-01-01 -- 2024-12-31.

Subject Categories

Computer Science

Computer Systems

DOI

10.1145/3624062.3624232

More information

Last updated

2024-01-25