Towards Accurate Estimation of Error Sensitivity in Computer Systems
Doctoral thesis, 2021
Using ISA-level fault injection, we investigate how five aspects, or factors, influence the error sensitivity of a program. We define error sensitivity as the conditional probability that a bit-flip error in live data in an ISA-register or main-memory word will cause a program to produce silent data corruption (SDC; i.e., an erroneous result). We also consider the estimation of a measure called SDC count, which represents the number of ISA-level bit flips that cause an SDC.
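As a rough illustration of the definition above, error sensitivity can be estimated from a fault-injection campaign as the fraction of experiments that end in an SDC. The sketch below is not taken from the thesis: the outcome labels (`"sdc"`, `"benign"`, `"detected"`), the function names, and the idea of scaling the estimate by a fault-space size to approximate the SDC count are all assumptions made for the example.

```python
from collections import Counter


def error_sensitivity(outcomes):
    """Estimate P(SDC | bit flip in live data) as the fraction of
    fault-injection experiments classified as SDC.

    `outcomes` is a list of hypothetical outcome labels, one per experiment.
    """
    if not outcomes:
        raise ValueError("at least one experiment outcome is required")
    counts = Counter(outcomes)
    return counts["sdc"] / len(outcomes)


def estimated_sdc_count(sensitivity, fault_space_size):
    """Assumed estimator: scale the sensitivity by the number of possible
    bit flips in live data (the size of the fault space)."""
    return sensitivity * fault_space_size


# Made-up campaign: 100 experiments, 30 of which ended in an SDC.
outcomes = ["sdc"] * 30 + ["benign"] * 50 + ["detected"] * 20
sensitivity = error_sensitivity(outcomes)
sdc_count = estimated_sdc_count(sensitivity, fault_space_size=1000)
```

The example separates the two measures on purpose: a program with a smaller fault space can have a lower SDC count even when its error sensitivity is unchanged.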
The five factors addressed are (a) the inputs processed by a program, (b) the level of compiler optimization, (c) the implementation of the program in the source code, (d) the fault model (single bit flips vs double bit flips) and (e) the fault-injection technique (inject-on-write vs inject-on-read). Our results show that these factors affect the error sensitivity in many ways; some factors strongly impact the error sensitivity or SDC count whereas others show a weaker impact. For example, our experiments show that single bit flips tend to cause SDCs more often than double bit flips; compiler optimization positively impacts the SDC count but not necessarily the error sensitivity; the error sensitivity varies between 20% and 50% among the programs we tested; and variations in input affect the error sensitivity significantly for most of the tested programs.
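The two fault models in factor (d) can be sketched as operations on a single data word. This is only an illustration under stated assumptions: a 32-bit word, uniformly random bit positions, and a double bit flip that hits two distinct bits of the same word. The thesis may define the double bit-flip model differently, and the names `flip_bits`, `single_bit_flip`, and `double_bit_flip` are hypothetical.

```python
import random

WORD_BITS = 32  # assumed word width for this sketch


def flip_bits(word, positions):
    """Return `word` with the bits at the given positions inverted."""
    for position in positions:
        word ^= 1 << position
    return word & ((1 << WORD_BITS) - 1)


def single_bit_flip(word, rng):
    """Single bit-flip model: invert one randomly chosen bit."""
    return flip_bits(word, [rng.randrange(WORD_BITS)])


def double_bit_flip(word, rng):
    """Assumed double bit-flip model: invert two distinct bits of one word."""
    return flip_bits(word, rng.sample(range(WORD_BITS), 2))


rng = random.Random(2021)  # fixed seed so a campaign can be repeated
original = 0x0000FFFF
after_single = single_bit_flip(original, rng)
after_double = double_bit_flip(original, rng)
```

In an inject-on-write campaign such a flip would be applied to a register or memory word right after it is written, and in an inject-on-read campaign right before it is read; the corrupting operation itself is the same in both cases.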
silent data corruption
error sensitivity
fault injection
soft errors
Author
Fatemeh Ayatolahi
Chalmers, Computer Science and Engineering, Computer Engineering
A Study of the Impact of Single Bit-Flip and Double Bit-Flip Errors on Program Execution
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 8153 LNCS (2013)
Paper in proceedings
A Comparison of Inject-on-Read and Inject-on-Write in ISA-Level Fault Injection
11th European Dependable Computing Conference (2016), p. 178-189
Paper in proceedings
On the Impact of Hardware Faults – An Investigation of the Relationship between Workload Inputs and Failure Mode Distributions
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 7612 (2012), p. 198-209
Paper in proceedings
A Study of the Impact of Bit-flip Errors on Programs Compiled with Different Optimization Levels
10th European Dependable Computing Conference, EDCC 2014; Newcastle upon Tyne; United Kingdom; 13 May 2014 through 16 May 2014 (2014), p. 146-157
Paper in proceedings
Back-to-Back Fault Injection Testing in Model-Based Development
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 9337 (2015), p. 135-148
Paper in proceedings
Subject categories
Computer and Information Science
Computational Mathematics
Electrical Engineering and Electronics
Computer Science
ISBN
978-91-7905-493-9
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4960
Publisher
Chalmers
EDIT 8103
Opponent: Associate professor Juan Carlos Ruiz, Technical University of Valencia, Spain