Wearout Resilience in NoCs Through an Aging Aware Adaptive Routing Algorithm

Continuous technology scaling has made aging mechanisms, such as negative bias temperature instability and electromigration primary concerns in network-on-chip (NoC) designs. In this paper, we extensively analyze the effects of these aging mechanisms on NoC routers and links. We observe a critical need of a robust aging-aware routing algorithm that not only reduces power-performance overheads caused due to aging degradation, but also minimizes the stress experienced by heavily utilized routers and links. To solve this problem, we propose an aging-aware adaptive routing algorithm and a router microarchitecture that routes the packets along the paths, which are both least congested and experience minimum aging degradation. After an extensive experimental analysis using real workloads, we observe 13% and 12.17% average overhead reduction in network latency and energy-delay product per flit, a 10.4% improvement in performance, and a 60% improvement in mean time to failure using our aging-aware routing algorithm.

[1]  Federico Silla,et al.  A methodology for the characterization of process variation in NoC links , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[2]  David Blaauw,et al.  Compact In-Situ Sensors for Monitoring Negative-Bias-Temperature-Instability Effect and Oxide Degradation , 2008, 2008 IEEE International Solid-State Circuits Conference - Digest of Technical Papers.

[3]  Stephen W. Keckler,et al.  Regional congestion awareness for load balance in networks-on-chip , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[4]  Radu Marculescu,et al.  FARM: Fault-aware resource management in NoC-based multiprocessor platforms , 2011, 2011 Design, Automation & Test in Europe.

[5]  Yu Hen Hu,et al.  A Fault-Tolerant NoC Scheme using bidirectional channel , 2011, 2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC).

[6]  William J. Dally,et al.  Research Challenges for On-Chip Interconnection Networks , 2007, IEEE Micro.

[7]  Nacer-Eddine Zergainoh,et al.  A fault-tolerant deadlock-free adaptive routing for on chip interconnects , 2011, 2011 Design, Automation & Test in Europe.

[8]  Tao Li,et al.  Architecting reliable multi-core network-on-chip for small scale processing technology , 2010, 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).

[9]  Mahmut T. Kandemir,et al.  Process variation-aware routing in NoC based multicores , 2011, 2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC).

[10]  Howard Jay Siegel,et al.  OE+IOE: A novel turn model based fault tolerant routing scheme for networks-on-chip , 2010, 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[11]  Sanghamitra Roy,et al.  Towards graceful aging degradation in NoCs through an adaptive routing algorithm , 2012, DAC Design Automation Conference 2012.

[12]  Paolo A. Aseron,et al.  A 45 nm Resilient Microprocessor Core for Dynamic Variation Tolerance , 2011, IEEE Journal of Solid-State Circuits.

[13]  Priyadarsan Patra,et al.  Impact of Process and Temperature Variations on Network-on-Chip Design Exploration , 2008, Second ACM/IEEE International Symposium on Networks-on-Chip (nocs 2008).

[14]  David Blaauw,et al.  A highly resilient routing algorithm for fault-tolerant NoCs , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.

[15]  Chrysostomos Nicopoulos,et al.  NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[16]  Luca Benini,et al.  A multi-path routing strategy with guaranteed in-order packet delivery and fault-tolerance for networks on chip , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[17]  Coniferous softwood GENERAL TERMS , 2003 .

[18]  Valeria Bertacco,et al.  Formally enhanced runtime verification to ensure NoC functional correctness , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[19]  Chita R. Das,et al.  A case for heterogeneous on-chip interconnects for CMPs , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[20]  Zhiyi Yu,et al.  A Scalable and Reconfigurable Fault-Tolerant Distributed Routing Algorithm for NoCs , 2011, IEICE Trans. Inf. Syst..

[21]  Radu Marculescu,et al.  Hitting Time Analysis for Fault-Tolerant Communication at Nanoscale in Future Multiprocessor Platforms , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[22]  Petru Eles,et al.  Fault and energy-aware communication mapping with guaranteed latency for applications implemented on NoC , 2005, Proceedings. 42nd Design Automation Conference, 2005..

[23]  Janet Roveda,et al.  Adaptive inter-router links for low-power, area-efficient and reliable Network-on-Chip (NoC) architectures , 2009, 2009 Asia and South Pacific Design Automation Conference.