ATARDS: An adaptive fault-tolerant strategy to cope with massive defects in Network-on-Chip interconnections

The use of embedded fault-tolerant mechanisms in Networks-on-Chip (NoCs) has become essential to ensure connectivity in the presence of massive defects and, consequently, to improve yield. Depending on the number of defects and their location in the NoC, fault-tolerant techniques can be very expensive in terms of area, performance, and energy overhead. Testing and diagnosis can help minimize the costs associated with embedded fault-tolerant mechanisms, because the mechanisms can then be adapted to act only in the defective regions. Our fault-tolerant strategy is based on adaptive routing and data splitting to cope with massive defects in NoC interconnections. The combination of these two techniques significantly improves reliability and energy efficiency. Experimental results with random massive interconnection faults show that our proposal can still sustain 100% connectivity with 60% of the wires defective. The energy penalty ranges from 5% to 40%, depending on the number of faulty interconnections, which is a much lower overhead than that of techniques such as Hamming codes.
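The abstract does not spell out how data splitting works, so the following is only a minimal sketch of the general idea, assuming that diagnosis yields a per-wire fault mask for each link and that a flit is then sent over the remaining fault-free wires in several narrower transfers. The names `send_split` and `receive_split`, and the mask representation, are hypothetical and are not taken from the paper.

```python
from typing import List, Optional

# One value per physical wire of the link; None means the wire is not driven.
Word = List[Optional[int]]


def send_split(bits: List[int], faulty: List[bool]) -> List[Word]:
    """Split a flit into link-level transfers that use only fault-free wires.

    `faulty[i]` is True when diagnosis marked wire i as defective
    (assumed representation, for illustration only).
    """
    good = [i for i, is_faulty in enumerate(faulty) if not is_faulty]
    if not good:
        # No usable wire at all: splitting cannot help, so the link would
        # have to be avoided by the routing layer instead.
        raise ValueError("no fault-free wire on this link")
    words: List[Word] = []
    for start in range(0, len(bits), len(good)):
        word: Word = [None] * len(faulty)
        # Place the next chunk of bits on the fault-free wires, in order.
        for wire, bit in zip(good, bits[start:start + len(good)]):
            word[wire] = bit
        words.append(word)
    return words


def receive_split(words: List[Word], faulty: List[bool], n_bits: int) -> List[int]:
    """Reassemble the original flit by reading only the fault-free wires."""
    good = [i for i, is_faulty in enumerate(faulty) if not is_faulty]
    bits = [word[i] for word in words for i in good if word[i] is not None]
    return bits[:n_bits]


# Example: an 8-wire link with 5 defective wires (more than 60%).
flit = [1, 0, 1, 1, 0, 0, 1, 0]
mask = [True, False, True, True, False, True, True, False]
transfers = send_split(flit, mask)
recovered = receive_split(transfers, mask, len(flit))
```

In this sketch the link stays usable as long as one wire is fault-free, at the cost of extra transfers per flit. A link with no fault-free wire raises an error here; in a real NoC that case would fall to the adaptive routing the abstract mentions.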
