On Impact and Tolerance of Data Errors with Varied Duration in Microprocessors
Doctoral thesis, 2003
The evolution of high-performance and low-cost microprocessors has led to their almost pervasive usage in embedded systems such as automotive electronics, smart gadgets, communication devices, etc. These mass-market products, when deployed in safety-critical systems, require safe services albeit at low recurring costs. Moreover, as these systems often operate in harsh environments, faults will occur during system operation, and thus, must be handled safely, i.e., tolerated.
This thesis investigates the efficiency of adding software-implemented fault tolerance techniques to commercial off-the-shelf (COTS) microprocessors. Specifically, the following problems are addressed:
Which faults need to be tolerated considering the architecture, implementation and operational environments for COTS processors?
Which software-implemented fault-tolerance techniques are effective and efficient to use?
How can the efficiencies of such designs be evaluated?
The main contribution of this thesis is the development of novel approaches for estimating the effects of data errors with varied duration, and for ascertaining the efficiency of applied fault-tolerance techniques. These approaches are based on identifying the characteristics that determine which effects data errors will have on the system. Then these characteristics can be varied at a high abstraction level and the effects observed.
The first approach is based on response analysis methods for understanding the effects of data errors on control systems. The second is a VHDL simulation-based fault injection method, based on insertion of specific components (so-called saboteurs) for varying the characteristics. As most system development processes start at a high abstraction level, we expect our approaches to be applied early in the process, and be a useful complement to traditional post-design assessment approaches such as fault-injection.
error propagation analysis
error effect analysis