CNN and RVV Co-design for Efficient Model Serving
Paper in proceedings, 2025

The performance of convolution algorithms depends on layer dimensions, while SIMD demands and cache sharing influence which algorithm should be selected at runtime. To identify the best settings, we perform a co-design exploration of convolutional layer parameters and three algorithms (Direct, im2col+GEMM, and Winograd) jointly with hardware parameters for RISC-V vector architectures. Our results show that considering hardware parameters together with layer dimensions improves execution time and efficiency, emphasizing the need for co-design.

Authors

Sonia Rani Gupta

Chalmers, Computer Science and Engineering, Computer Engineering

Nikela Papadopoulou

University of Glasgow

Jing Chen

Chalmers, Computer Science and Engineering, Computer Engineering

Miquel Pericas

Chalmers, Computer Science and Engineering, Computer Engineering

DEBS 2025: Proceedings of the 19th ACM International Conference on Distributed and Event-Based Systems

243-244
9798400713323 (ISBN)

19th ACM International Conference on Distributed and Event-Based Systems, DEBS 2025
Gothenburg, Sweden

P4PIM: Principles for power-constrained HPC programming for PIM networks

Swedish Research Council (VR) (2020-04892), 2021-01-01 -- 2024-12-31.

Subject categories (SSIF 2025)

Computer Sciences

Computer Systems

DOI

10.1145/3701717.3733226

More information

Last updated

2025-08-29