Fault-tolerant network interface for spatial division multiplexing based Network-on-Chip

The progressive maturity of VLSI manufacturing technology makes it possible to integrate more and more processing elements and memory units on a single die to form a Multiprocessor System-on-Chip (MPSoC). Network-on-Chip (NoC) has been adopted as the communication backbone for most of these modern multiprocessor systems. As the complexity of these systems scales, there is growing concern about the dependability of their processing and communication elements. In this paper, we propose a centralized hardware fault-tolerant network interface (NI) for NoCs based on spatial division multiplexing. Experiments show that the proposed design has better throughput than a non-fault-tolerant design with only 18% area overhead. We also introduce an area-optimized distributed fault-tolerant NI architecture that provides 50% more throughput than the centralized design at high fault rates.
