High Performance Hybrid Memory Systems with 3D-stacked DRAM
Licentiate thesis, 2019
with the increasing demand of data-intensive workloads limiting performance.
3D-stacked DRAM can alleviate this problem by providing substantially higher
bandwidth to a processor chip. However, the capacity of 3D-stacked DRAM is
not enough to replace the bulk of the memory and therefore it is used either
as a DRAM cache or as part of a flat address space with support for data
migration. The performance of both designs is limited by
their particular overheads. In this thesis we propose designs that improve
the performance of hybrid memory systems in which 3D-stacked DRAM is
used either as a cache or as part of a flat address space with data migration.

DRAM caches have shown excellent potential in capturing the spatial and
temporal data locality of applications; however, they are still far from their ideal
performance. Besides the unavoidable DRAM access to fetch the requested
data, tag access is on the critical path, adding significant latency and energy
costs. Existing approaches are not able to remove these overheads and in
some cases limit DRAM cache design options. To alleviate the tag access
overheads of DRAM caches, this thesis proposes Decoupled Fused Cache (DFC),
a DRAM cache design that fuses DRAM cache tags with the tags of the on-chip
Last Level Cache (LLC) to access the DRAM cache data directly on LLC
misses. Compared to current state-of-the-art DRAM caches, DFC improves
system performance by 6% on average and by 16-18% for large cache line sizes.
DFC also reduces DRAM cache traffic by 18% and DRAM cache energy
consumption by 7%.

Data migration schemes have significant performance
potential, but also entail overheads, which may diminish migration benefits
or even lead to performance degradation. These overheads are mainly due to
the high cost of swapping data between memories, which also makes selecting
which data to migrate critical to performance. To address these challenges
of data migration, this thesis proposes LLC guided Data Migration (LGM).
LGM uses the LLC to predict future reuse and select memory segments for
migration. Furthermore, LGM reduces the data migration traffic overheads by
not migrating the cache lines of memory segments which are present in the
LLC. LGM outperforms current state-of-the-art migration designs, improving
system performance by 12.1% and reducing memory system dynamic energy
by 13.2%.
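The two LGM ideas described above (using LLC behavior to pick segments for migration, and skipping cache lines that are already in the LLC) can be illustrated with a toy sketch. This is not the mechanism from the thesis: the segment size, the hit-count threshold, and the function names `segment_of`, `select_segments`, and `lines_to_migrate` are all assumptions made up for illustration, and counting LLC hits per segment is only one plausible stand-in for the reuse predictor.

```python
from collections import Counter

# Hypothetical parameters, chosen only for this illustration.
LINES_PER_SEGMENT = 32   # assumed number of cache lines per memory segment
REUSE_THRESHOLD = 4      # assumed LLC-hit count that marks a segment as reused


def segment_of(line_addr: int) -> int:
    """Map a cache-line address to the index of its memory segment."""
    return line_addr // LINES_PER_SEGMENT


def select_segments(llc_hits: list[int]) -> set[int]:
    """Pick segments whose lines hit in the LLC at least REUSE_THRESHOLD times."""
    hits_per_segment = Counter(segment_of(addr) for addr in llc_hits)
    return {seg for seg, n in hits_per_segment.items() if n >= REUSE_THRESHOLD}


def lines_to_migrate(segment: int, llc_resident: set[int]) -> list[int]:
    """List the segment's lines to move, skipping lines already held in the LLC."""
    first = segment * LINES_PER_SEGMENT
    return [addr for addr in range(first, first + LINES_PER_SEGMENT)
            if addr not in llc_resident]


if __name__ == "__main__":
    hits = [0, 1, 2, 3, 40]              # four hits in segment 0, one in segment 1
    chosen = select_segments(hits)
    moved = lines_to_migrate(0, llc_resident={0, 1, 2, 3})
    print(chosen, len(moved))
```

In this toy setting, the lines resident in the LLC are left out of the migration list, which is the source of the traffic reduction the abstract refers to; how much is saved in a real system depends on details the abstract does not give.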
Hybrid memory systems
DRAM caches
Data migration
3D-stacked DRAM
Author
Evangelos Vasilakis
Chalmers, Computer Science and Engineering, Computer Engineering
Decoupled fused cache: Fusing a decoupled LLC with a DRAM cache
Transactions on Architecture and Code Optimization, Vol. 15 (2019)
Journal article
Subject categories
Computer Engineering
Computer Science
Computer Systems
Areas of Advance
Information and Communication Technology
Infrastructure
C3SE (Chalmers Centre for Computational Science and Engineering)
Publisher
Chalmers
Room EA, EDIT building, Rännvägen 6, Chalmers University of Technology, Campus Johanneberg
Opponent: Prof. Yale Patt, University of Texas at Austin, U.S.A.