Resilient routing implementation in 2D mesh NoC

With the rapid shrinking of technology and growing integration capacity, the probability of failures in Networks-on-Chip (NoCs) increases and thus, fault tolerance is essential. Moreover, the unpredictable locations of these failures may influence the regularity of the underlying topology, and a regular 2D mesh is likely to become irregular. Thus, for these failure-prone networks, a viable routing framework should comprise a topology-agnostic routing algorithm along with a cost-effective, scalable routing mechanism able to handle failures, irrespective of any particular failure patterns. Existing routing techniques designed to route irregular topologies efficiently lack flexibility (logic-based), scalability (table-based) or relaxed switch design (uLBDR-based). Designing an efficient routing implementation technique to address irregular topologies remains a pressing research problem. To address this, we present a fault resilient routing mechanism for irregular 2D meshes resulting from failures. To handle irregularities, it avoids using routing tables and employs a few fixed configuration bits per switch resulting in a scalable approach. Experiments demonstrate that the proposed approach is guaranteed to tolerate all locations of single and double-link failures and most multiple failures. Also, unlike uLBDR it is not restricted to any particular switching technique and does not replicate any extra messages. Along with fault tolerance, the proposed mechanism can achieve better network performance in fault-free cases. The proposed technique achieves graceful performance degradation during failure. Compared to uLBDR, our method has 14% less area requirements and 16% less overall power consumption.

[1]  Wolfgang Rosenstiel,et al.  Fully Adaptive Fault-Tolerant Routing Algorithm for Network-on-Chip Architectures , 2007, 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools (DSD 2007).

[2]  Antonio Robles,et al.  An Efficient Fault-Tolerant Routing Methodology for Meshes and Tori , 2004, IEEE Computer Architecture Letters.

[3]  David Blaauw,et al.  Vicis: A reliable network for unreliable silicon , 2009, 2009 46th ACM/IEEE Design Automation Conference.

[4]  Daniel H. Linder,et al.  An Adaptive and Fault Tolerant Wormhole Routing Strategy for k-Ary n-Cubes , 1994, IEEE Trans. Computers.

[5]  Axel Jantsch,et al.  Addressing Transient and Permanent Faults in NoC With Efficient Fault-Tolerant Deflection Router , 2013, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[6]  Federico Silla,et al.  Cost-Efficient On-Chip Routing Implementations for CMP and MPSoC Systems , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[7]  Yong-Bin Kim,et al.  Fault Tolerant Source Routing for Network-on-chip , 2007, 22nd IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2007).

[8]  Michele Favalli,et al.  A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs , 2013, TECS.

[9]  Andrés Mejía Gómez Design and implementation of efficient topology agnostic routing algorithms for interconnection networks , 2008 .

[10]  Wolfgang Rosenstiel,et al.  Fully Adaptive Fault-Tolerant Routing Algorithm for Network-on-Chip Architectures , 2007 .

[11]  Cristian Constantinescu,et al.  Trends and Challenges in VLSI Circuit Reliability , 2003, IEEE Micro.

[12]  Sudhakar Yalamanchili,et al.  Interconnection Networks: An Engineering Approach , 2002 .

[13]  Lionel M. Ni,et al.  Fault-tolerant wormhole routing in meshes , 1993, FTCS-23 The Twenty-Third International Symposium on Fault-Tolerant Computing.

[14]  Valentin Puente,et al.  Immunet: a cheap and robust fault-tolerant packet routing mechanism , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[15]  José Duato,et al.  An Efficient Implementation of Distributed Routing Algorithms for NoCs , 2008 .

[16]  Shekhar Y. Borkar,et al.  Microarchitecture and Design Challenges for Gigascale Integration , 2004, MICRO.

[17]  Michael Burrows,et al.  Autonet: A High-Speed, Self-Configuring Local Area Network Using Point-to-Point Links , 1991, IEEE J. Sel. Areas Commun..

[18]  Antonio Robles,et al.  A Survey and Evaluation of Topology-Agnostic Deterministic Routing Algorithms , 2012, IEEE Transactions on Parallel and Distributed Systems.

[19]  Larry J. Stockmeyer,et al.  A new approach to fault-tolerant wormhole routing for mesh-connected parallel computers , 2004, Proceedings 16th International Parallel and Distributed Processing Symposium.

[20]  José Duato,et al.  An Efficient Implementation of Distributed Routing Algorithms for NoCs , 2008, Second ACM/IEEE International Symposium on Networks-on-Chip (nocs 2008).

[21]  William J. Dally,et al.  The Reliable Router: A Reliable and High-Performance Communication Substrate for Parallel Computers , 1994, PCRCW.

[22]  José Duato,et al.  On the Potentials of Segment-Based Routing for NoCs , 2008, 2008 37th International Conference on Parallel Processing.

[23]  Luca Benini,et al.  Networks on chips - technology and tools , 2006, The Morgan Kaufmann series in systems on silicon.

[24]  S. Medardoni,et al.  Flexible DOR routing for virtualization of multicore chips , 2009, 2009 International Symposium on System-on-Chip.

[25]  Li-Shiuan Peh,et al.  ARIADNE: Agnostic Reconfiguration in a Disconnected Network Environment , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.

[26]  David Blaauw,et al.  A highly resilient routing algorithm for fault-tolerant NoCs , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.