The DeSyRe runtime support for fault-tolerant embedded MPSoCs
Semiconductor technology scaling makes chips more sensitive to faults. This paper describes the DeSyRe design approach and its runtime management for future reliable embedded Multiprocessor Systems-on-Chip (MPSoCs). A lightweight runtime system is described for shared-memory MPSoCs to support fault-tolerant execution upon detection of transient and permanent faults. The DeSyRe runtime system offers re-execution of tasks that suffer from transient faults and task migration in cases where a worker processor is permanently faulty. In addition, a faulty worker can potentially remain usable, increasing the system's fault tolerance. This is achieved by using alternative task implementations, which avoid the faulty circuit and are indicated in the application code via pragma annotations, as well as by repairing a faulty core via hardware reconfiguration. The system can thus be dynamically adapted using one or more of the above mechanisms to mitigate faults. The DeSyRe runtime system is evaluated using micro-benchmarks running on a Virtex-6 FPGA MPSoC. Results suggest that our enhanced fault-tolerant runtime system can successfully and efficiently execute all application tasks under a variety of fault cases.
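To make the interplay of these mechanisms concrete, the following is a minimal simulation sketch in Python. It is not the DeSyRe runtime or its API. All names (`Worker`, `Variant`, `Task`, `execute`) are hypothetical, and so is the order in which the mechanisms are tried (alternative implementation, then repair, then migration). Fault detection is also replaced by simple bookkeeping: a permanent fault is a set of broken hardware units per worker, a transient fault is a counter of injected failures per task, and the list of variants stands in for the pragma-annotated alternative implementations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Worker:
    name: str
    # Hardware units with a permanent fault on this worker (assumed model).
    faulty_units: Set[str] = field(default_factory=set)
    # Assumption: whether hardware reconfiguration can repair this worker.
    repairable: bool = False


@dataclass
class Variant:
    run: Callable[[], int]   # one implementation of the task
    needs: frozenset         # hardware units this implementation relies on


@dataclass
class Task:
    name: str
    variants: List[Variant]  # stands in for pragma-annotated alternatives


def execute(task: Task, workers: List[Worker],
            transient_faults: Dict[str, int],
            max_retries: int = 3) -> Tuple[int, List[str]]:
    """Run a task, returning its result and a log of recovery actions."""
    log: List[str] = []
    for worker in workers:
        # 1. Prefer an implementation that avoids the faulty circuit.
        variant = next((v for v in task.variants
                        if not (v.needs & worker.faulty_units)), None)
        # 2. Otherwise try to repair the worker (models reconfiguration).
        if variant is None and worker.repairable:
            worker.faulty_units.clear()
            log.append(f"repair {worker.name}")
            variant = task.variants[0]
        # 3. Otherwise migrate the task to the next worker.
        if variant is None:
            log.append(f"migrate from {worker.name}")
            continue
        # Transient faults: re-execute on the same worker, up to a bound.
        for _ in range(max_retries + 1):
            if transient_faults.get(task.name, 0) > 0:
                transient_faults[task.name] -= 1  # consume injected fault
                log.append(f"re-execute on {worker.name}")
                continue
            return variant.run(), log
        log.append(f"migrate from {worker.name}")
    raise RuntimeError(f"no worker could complete task {task.name}")
```

In this sketch, a task whose only implementation needs a broken unit on a non-repairable worker is migrated to the next worker, while a task with an alternative implementation stays on the faulty worker. A real runtime would rely on actual fault detection and would likely apply its own policy for choosing among these mechanisms.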

