Odd-ECC: On-demand DRAM error correcting codes
Paper in proceedings, 2017

An application may have different sensitivity to faults in different subsets of the data it uses, so some data regions may be more critical than others. Capitalizing on this observation, Odd-ECC provides a mechanism to dynamically select, on demand, the memory fault tolerance of each allocated page of a program according to the criticality of its data. Odd-ECC error correcting codes (ECCs) are stored in separate physical pages, which the OS hides from the user as unavailable pages. These ECCs are nevertheless physically aligned with the data they protect, so the memory controller can access them efficiently. As a result, the capacity, performance, and energy overheads of memory fault tolerance are proportional to the criticality of the data stored. Odd-ECC is applied to memory systems that use conventional 2D DRAM DIMMs as well as to 3D-stacked DRAMs, and is evaluated using various applications. Compared to flat memory protection schemes, Odd-ECC substantially reduces ECC capacity overheads while achieving the same Mean Time to Failure (MTTF), and it also slightly improves performance and energy costs. Under the same capacity constraints, Odd-ECC achieves substantially higher MTTF than flat memory protection. This comes at a performance and energy cost, which is nevertheless only a fraction of the cost introduced by an equally strong flat scheme.
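
The capacity argument in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation: the page size, the three protection levels, the overhead ratio of the "STRONG" level, and the example page mix are all assumptions chosen for illustration. Only the 12.5% ratio for SECDED follows from the standard (72,64) code, which stores 8 check bits per 64 data bits.

```python
from enum import Enum

PAGE_SIZE = 4096  # bytes per page; assumed for illustration


class Protection(Enum):
    """ECC bytes stored per data byte (illustrative values)."""
    NONE = 0.0
    SECDED = 0.125  # 8 check bits per 64 data bits, as in a (72,64) code
    STRONG = 0.25   # hypothetical stronger code, not taken from the paper


def ecc_bytes(pages):
    """ECC storage when each page carries its own protection level."""
    return sum(int(PAGE_SIZE * level.value) for level in pages)


def flat_ecc_bytes(num_pages, level):
    """ECC storage when every page uses the same protection level."""
    return num_pages * int(PAGE_SIZE * level.value)


# Hypothetical program: 2 critical pages, 6 moderately critical, 8 non-critical.
pages = (
    [Protection.STRONG] * 2
    + [Protection.SECDED] * 6
    + [Protection.NONE] * 8
)

on_demand = ecc_bytes(pages)
flat = flat_ecc_bytes(len(pages), Protection.STRONG)

print(f"per-page ECC storage: {on_demand} bytes")
print(f"flat ECC storage:     {flat} bytes")
```

Under these assumed numbers, protecting only the critical pages with the strongest code needs less ECC storage than applying that code to every page, which is the sense in which the overhead scales with data criticality.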

3D-Stacked memory

Error correcting codes

DRAM

Main memory reliability

Applications reliability analysis

Author

Alirad Malek

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Evangelos Vasilakis

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Vasileios Papaefstathiou

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Pedro Petersen Moura Trancoso

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

University of Cyprus

Ioannis Sourdis

Institute of Computer Science Crete

ACM International Conference Proceeding Series

Vol. Part F131197, pp. 96-101

Subject Categories

Computer and Information Science

DOI

10.1145/3132402.3132443

ISBN

9781450353359

More information

Latest update

3/21/2023