At the Locus of Performance: Quantifying the Effects of Copious 3D-Stacked Cache on HPC Workloads
Journal article, 2023

Over the last three decades, innovations in the memory subsystem were primarily targeted at overcoming the data movement bottleneck. In this paper, we focus on a specific market trend in memory technology: 3D-stacked memory and caches. We investigate the impact of extending the on-chip memory capabilities in future HPC-focused processors, particularly by 3D-stacked SRAM. First, we propose a method oblivious to the memory subsystem to gauge the upper-bound in performance improvements when data movement costs are eliminated. Then, using the gem5 simulator, we model two variants of a hypothetical LARge Cache processor (LARC), fabricated in 1.5 nm and enriched with high-capacity 3D-stacked cache. With a volume of experiments involving a broad set of proxy-applications and benchmarks, we aim to reveal how HPC CPU performance will evolve, and conclude an average boost of 9.56× for cache-sensitive HPC applications, on a per-chip basis. Additionally, we exhaustively document our methodological exploration to motivate HPC centers to drive their own technological agenda through enhanced co-design.

proxy-applications

3D-stacked memory

emerging architecture study

gem5 simulation

Authors

Jens Domke

RIKEN

Emil Vatai

RIKEN

Balazs Gerofi

Intel Corporation

Yuetsu Kodama

RIKEN

Mohamed Wahib

RIKEN

Artur Podobas

Kungliga Tekniska Högskolan (KTH)

Sparsh Mittal

IIT Roorkee

Miquel Pericas

Chalmers, Computer Science and Engineering, Computer Engineering

Lingqi Zhang

Tokyo Institute of Technology

Peng Chen

National Institute of Advanced Industrial Science and Technology (AIST)

Aleksandr Drozd

RIKEN

Satoshi Matsuoka

RIKEN

Transactions on Architecture and Code Optimization

1544-3566 (ISSN) 1544-3973 (eISSN)

Vol. 20, Issue 4, Article 57

Areas of Advance

Information and Communication Technology

Subject Categories

Computer Systems

DOI

10.1145/3629520

More information

Latest update

2024-01-12