A Low-Power and SEU-Tolerant Switch Architecture for Network on Chips

High reliability, high performance, and low power consumption are the main objectives in the design of networks on chip (NoCs). These three objectives mostly conflict and must be considered simultaneously to reach an optimal design. This paper proposes a method that combines duplication of the virtual channels in each NoC node with parity codes to prevent single event upsets (SEUs) from producing erroneous data. The method is compared with two widely used SEU-tolerant methods, namely switch-to-switch and end-to-end flow control, in terms of reliability, power consumption, and performance. A flit-level VHDL-based simulator and the Synopsys Power Compiler tool were used to obtain the experimental results. The simulation results show the same reliability for all three methods, while the proposed method shows the lowest power consumption and the highest performance at almost all traffic generation rates and packet error rates.
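
The abstract does not spell out how duplication and parity interact, so the following Python sketch is only one plausible reading, not the paper's actual switch design. It assumes each flit is written to two buffer copies, each with its own even-parity bit, and that a read falls back to the second copy when the first fails its parity check. The names `parity` and `DuplicatedBuffer` are hypothetical, and the sketch models a single virtual channel as a simple FIFO.

```python
def parity(word: int) -> int:
    """Return 1 if the number of set bits in `word` is odd, else 0."""
    return bin(word).count("1") & 1


class DuplicatedBuffer:
    """Hypothetical FIFO holding two copies of every flit, each with a parity bit.

    This is an illustrative assumption about how duplication plus parity
    could mask a single bit flip; it is not taken from the paper.
    """

    def __init__(self):
        self.primary = []  # entries are [flit, parity_bit]
        self.shadow = []   # duplicate copy of the same entries

    def write(self, flit: int) -> None:
        p = parity(flit)
        self.primary.append([flit, p])
        self.shadow.append([flit, p])

    def read(self) -> int:
        flit_a, par_a = self.primary.pop(0)
        flit_b, par_b = self.shadow.pop(0)
        if parity(flit_a) == par_a:
            return flit_a  # primary copy passes its parity check
        if parity(flit_b) == par_b:
            return flit_b  # primary corrupted, use the duplicate
        raise ValueError("both copies failed the parity check")


if __name__ == "__main__":
    buf = DuplicatedBuffer()
    buf.write(0b10110010)
    buf.primary[0][0] ^= 1 << 3  # emulate an SEU flipping one stored bit
    print(buf.read() == 0b10110010)
```

Under this reading, a single upset in one copy is masked locally without any retransmission, which would be consistent with the lower power and higher performance the abstract reports relative to the flow-control-based schemes.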
