The DeSyRe runtime support for fault-tolerant embedded MPSoCs
Paper in proceedings, 2014
Semiconductor technology scaling makes chips moresensitive to faults. This paper describes the DeSyRe designapproach and its runtime management for future reliable embedded Multiprocessor Systems-on-Chip (MPSoCs). A light weight runtime system is described for shared-memory MPSoCs to support fault-tolerant execution upon detection of transient and permanent faults. The DeSyRe runtime system offers re-execution of tasks that suffer from transient faults and task-migration in cases where a worker processor is permanently faulty. In addition, a faulty worker can potentially remainusable, increasing systems fault-tolerance. This is achieved using alternative task implementations, which avoid the faulty circuit and are indicated in the application-code via pragma annotations, as well as by repairing a faulty core via hardware reconfiguration. Thereby, the system can be dynamically adapted using one ormultiple of the above mechanisms to mitigate faults. The DeSyReruntime system is evaluated using micro-benchmarks running ona Virtex-6 FPGA MPSoC. Results suggest that our enhance dfault-tolerant runtime system can successfully and efficiently execute all application tasks under a variety of fault cases.