Graceful Degradation of Adaptive Multiprocessor Systems on a Chip
Licentiate thesis, 2015

This thesis explores how the flexibility already present in Multiprocessor Systems on a Chip can be used to keep them functioning in the presence of permanent faults and to prolong their lifetime. Technology scaling in accordance with Moore’s law raises reliability challenges and forces the use of potentially unreliable hardware components. However, hardware reconfigurability and workload flexibility can provide the means for permanent fault tolerance and graceful degradation via runtime system management. This work first elaborates on the concept of degradable hardware components and presents a methodology for characterizing each of their possible configurations, which is necessary for using this reconfigurability efficiently to work around permanent faults. The characterization methodology is then used to perform design space exploration, aiming to find the optimal reconfiguration granularity for any given fault density. Subsequently, Graceful Degradation of Multiprocessor Systems on a Chip is defined as an optimization problem. Three algorithms are proposed for solving this problem quickly enough to be applicable at runtime: a novel fast heuristic based on incremental and partially precomputed solutions, and our versions of two well-established search algorithms (simulated annealing and a genetic algorithm), tailored to the particular problem. The fast heuristic is shown to find a solution that is on average 81.9% as good as the exhaustively sought optimal one, in less than 2 μs on average. Our versions of simulated annealing and the genetic algorithm find better solutions on average (86.6% and 89.6% as good as the optimal, respectively), at the cost of execution times that are one and two orders of magnitude slower, respectively.
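As a rough illustration of the evaluation idea in the abstract (scoring a search heuristic against an exhaustively found optimum), the following Python sketch runs a generic simulated annealing on a toy mapping problem. It is not the thesis's formulation or its tailored algorithm: the core capacities, task loads, task values, the `score` function, and the cooling schedule are all hypothetical and chosen only to keep the example small.

```python
import itertools
import math
import random

# Hypothetical toy instance: none of these numbers come from the thesis.
CORE_CAPACITY = [4, 3, 0, 5]   # capacity 0 models a core lost to a permanent fault
TASK_LOAD = [2, 3, 1, 4, 2, 3]
TASK_VALUE = [5, 8, 2, 9, 4, 6]
DROPPED = -1                   # marks a task that is shed instead of mapped
CHOICES = [DROPPED] + list(range(len(CORE_CAPACITY)))


def score(mapping):
    """Total value of the mapped tasks, or -1 if any core is overloaded."""
    used = [0] * len(CORE_CAPACITY)
    total = 0
    for task, core in enumerate(mapping):
        if core == DROPPED:
            continue
        used[core] += TASK_LOAD[task]
        total += TASK_VALUE[task]
    if any(u > c for u, c in zip(used, CORE_CAPACITY)):
        return -1
    return total


def exhaustive():
    """Reference optimum: try every possible task-to-core mapping."""
    return max(itertools.product(CHOICES, repeat=len(TASK_LOAD)), key=score)


def simulated_annealing(steps=2000, t_start=5.0, t_end=0.01, seed=0):
    """Generic simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    current = [DROPPED] * len(TASK_LOAD)   # shedding every task is always feasible
    best = list(current)
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)
        candidate = list(current)
        candidate[rng.randrange(len(candidate))] = rng.choice(CHOICES)
        if score(candidate) < 0:
            continue                       # reject overloaded mappings outright
        delta = score(candidate) - score(current)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if score(current) > score(best):
                best = list(current)
    return best
```

Under these assumptions, `score(simulated_annealing()) / score(exhaustive())` gives the kind of "fraction of the optimum" figure the abstract reports, although the percentages quoted there come from the thesis's own experiments and are not reproduced by this toy instance.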

design space exploration

runtime management

graceful degradation

system optimization

Defect and fault tolerance

Lecture room ED, EDIT building, Rännvägen 6, Johanneberg campus
Opponent: Professor Cristiana Bolchini, Politecnico di Milano, Italy

Author

Stavros Tzilis

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

A dependable coarse-grain reconfigurable multicore array

Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS (2014), p. 141-150

Paper in proceeding

A runtime manager for gracefully degrading SoCs

Proceedings - IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (2014), p. 216-221

Paper in proceeding

A Probabilistic Analysis of Resilient Reconfigurable Designs

27th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2014, Amsterdam, Netherlands, 1-3 October 2014 (2014), p. 141-146

Paper in proceeding

Areas of Advance

Information and Communication Technology

Subject Categories

Embedded Systems

Computer Systems

Technical report L - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University: 128L

More information

Created

10/7/2017