Reducing the performance overhead of resilient CMPs with substitutable resources

Alirad Malek; Stavros Tzilis; Danish Anis Khan; Ioannis Sourdis; G. Smaragdos; C. Strydis

doi:10.1109/DFT.2015.7315161

Paper in proceeding, 2015

Permanent faults on a chip are often tolerated using spare resources. In the past, sparing has been applied to Chip Multiprocessors (CMPs) at various granularities of substitutable units (SUs). Entire processors, pipeline stages, or even individual functional units are isolated when faulty and replaced by spare ones using flexible, reconfigurable interconnects. Although spare resources increase a system's fault tolerance, the extra delay imposed by the reconfigurable interconnects limits performance. In this paper, we study two options for dealing with this delay: (i) pipelining the reconfigurable interconnects and (ii) scaling down the operating frequency. The former keeps the frequency close to that of the baseline processor but increases the number of cycles required to execute a program. The latter keeps the number of execution cycles constant but requires a slower clock. We investigate this performance tradeoff using an adaptive 4-core CMP design with substitutable pipeline stages. We obtain post-place-and-route results for different designs running two sets of benchmarks and evaluate their performance. Our experiments indicate that adding reconfigurable interconnects for wiring the SUs of a 4-core CMP imposes a significant delay, increasing the critical path of the design by almost 3.5 times. On the other hand, pipelining the reconfigurable interconnects increases cycle time by 41% and, depending on the processor configuration, reduces the performance overhead to 1.4-2.9× the execution time of the baseline.
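The tradeoff in the abstract can be read as a simple product: execution time relative to the baseline equals the cycle-count ratio times the cycle-time ratio. The following is a minimal back-of-the-envelope sketch of that arithmetic, not the authors' evaluation method. The function names are hypothetical, the frequency-scaling case assumes the clock period grows in proportion to the critical path (about 3.5×) with an unchanged cycle count, and the "implied cycle factor" is derived here from the abstract's figures rather than reported by the paper.

```python
def relative_execution_time(cycle_factor: float, cycle_time_factor: float) -> float:
    """Execution time relative to baseline: (cycle-count ratio) * (cycle-time ratio)."""
    return cycle_factor * cycle_time_factor


def implied_cycle_factor(exec_time_factor: float, cycle_time_factor: float) -> float:
    """Cycle-count ratio implied by a given execution-time and cycle-time ratio."""
    return exec_time_factor / cycle_time_factor


# Figures taken from the abstract.
CRITICAL_PATH_FACTOR = 3.5      # unpipelined interconnects: critical path ~3.5x
PIPELINED_CYCLE_TIME = 1.41     # pipelined interconnects: cycle time +41%
PIPELINED_EXEC_RANGE = (1.4, 2.9)  # reported execution time vs. baseline

# Option (ii), frequency scaling: same cycle count, slower clock.
# Assumption: the clock period scales directly with the critical path.
freq_scaling_time = relative_execution_time(1.0, CRITICAL_PATH_FACTOR)
print(f"frequency scaling: {freq_scaling_time:.2f}x baseline execution time")

# Option (i), pipelining: faster clock than (ii), but more cycles.
# The cycle factors below are derived here, not reported in the abstract.
for exec_factor in PIPELINED_EXEC_RANGE:
    cycles = implied_cycle_factor(exec_factor, PIPELINED_CYCLE_TIME)
    print(f"pipelining: {exec_factor:.1f}x execution time implies "
          f"about {cycles:.2f}x the baseline cycle count")
```

Under these assumptions, pipelining stays ahead of frequency scaling as long as the extra cycles cost less than the ratio of the two cycle times, which is consistent with the reported 1.4-2.9× range sitting below roughly 3.5×.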

Author

Alirad Malek

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)


Stavros Tzilis

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)


Danish Anis Khan

Chalmers, Computer Science and Engineering (Chalmers)


Ioannis Sourdis

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)


G. Smaragdos

Erasmus University Rotterdam

C. Strydis

Erasmus University Rotterdam

Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015

191-196
978-1-5090-0312-9 (ISBN)

Subject Categories (SSIF 2011)

Computer Engineering

Computer and Information Science

Areas of Advance

Information and Communication Technology

DOI

10.1109/DFT.2015.7315161


ISBN

978-1-5090-0312-9

More information

Latest update

12/1/2020
