Fault-Tolerant Network-On-Chip Router Architecture Design for Heterogeneous Computing Systems in the Context of Internet of Things

Network-on-chip (NoC) architectures have become a popular communication platform for heterogeneous computing systems owing to their scalability and high performance. Aggressive technology scaling, however, makes these architectures prone to both permanent and transient faults. This study focuses on the tolerance of an NoC router to permanent faults. Because a permanent fault in a single router can severely degrade the performance of the entire network, component-level protection techniques must be incorporated into the router. In the proposed scheme, each pipeline stage is protected separately: the input port uses a bypass path, virtual channel (VC) queuing, and VC closing strategies; the routing computation stage uses spatial redundancy and double routing; the VC allocation stage uses spatial redundancy; the switch allocation stage uses run-time arbiter selection; and the crossbar stage uses a triple bypass bus. The proposed router is more fault-tolerant than existing state-of-the-art fault-tolerant routers. In terms of the mean-time-to-failure (MTTF) metric, its reliability is 7.98 times higher than that of the unprotected baseline router. Its protection ability is evaluated with the silicon protection factor (SPF) metric, and the results confirm that the proposed router offers greater protection ability than conventional fault-tolerant routers.
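The two reliability metrics named in the abstract can be illustrated with a minimal sketch. It assumes definitions commonly used in the fault-tolerant router literature, not ones confirmed by this abstract: MTTF is taken as the reciprocal of the summed constant failure rates of the pipeline stages (expressed in FIT, failures per 10^9 hours), and SPF as the mean number of faults needed to cause a failure divided by one plus the fractional area overhead. The function names and all numeric inputs are hypothetical and are not results from the paper.

```python
def mttf_hours(stage_fit_rates):
    """MTTF in hours of a series system of pipeline stages.

    Assumption: each stage has a constant failure rate given in FIT
    (failures per 1e9 device-hours), and any stage failure fails the router.
    """
    total_fit = sum(stage_fit_rates)
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    return 1e9 / total_fit


def silicon_protection_factor(mean_faults_to_failure, area_overhead):
    """SPF under the assumed definition:
    mean faults to failure / (1 + fractional area overhead)."""
    if area_overhead < 0:
        raise ValueError("area overhead must be non-negative")
    return mean_faults_to_failure / (1.0 + area_overhead)


if __name__ == "__main__":
    # Hypothetical FIT rates for five router stages (illustrative only).
    baseline_fit = [300.0, 100.0, 200.0, 150.0, 250.0]
    print("baseline MTTF (h):", mttf_hours(baseline_fit))

    # Hypothetical protected design: 6 faults on average before failure,
    # at a 50% area overhead (illustrative only).
    print("SPF:", silicon_protection_factor(6.0, 0.5))
```

Under these assumed definitions, a higher SPF means a design tolerates more faults per unit of added silicon, which is why the metric is suited to comparing routers with different area overheads.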
