Fault Tolerant Deadlock-Free Adaptive Routing Algorithms for Hexagonal Networks-on-Chip

Technology scaling has allowed a large number of cores to be integrated on a single chip, which significantly improves the speed of on-chip processing. The network-on-chip (NoC) is the interconnection network that provides efficient and flexible communication between the cores of such multi-processor systems-on-chip. However, the performance gains of technology scaling come at the cost of reliability, as on-chip components, particularly the network-on-chip, become increasingly prone to faults. Redundancy is the basic approach to fault tolerance, and in this paper we investigate the hexagonal on-chip network topology, whose redundant diagonal inter-router links give it approximately 1.5 times as many links as the mesh topology. To evaluate the fault tolerance of the hexagonal network under wormhole switching, we present deadlock-free fault-tolerant routing algorithms obtained by applying the turn model, without the use of costly virtual channels. To circumvent the problem of finding the right selection of turns to prevent deadlock, we propose an approach based on the transitive closure of the channel dependency matrix. The results indicate that the hexagonal NoC with the proposed adaptive routing algorithms significantly improves NoC resilience: it can tolerate two router faults, whereas the mesh NoC can tolerate only one. Moreover, the proposed approach is general and can be adopted for developing adaptive routing algorithms for any regular network topology.
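The transitive-closure idea can be illustrated with a minimal sketch. It assumes the channel dependency matrix is a square Boolean matrix in which entry (i, j) is true when a packet holding channel i may next request channel j; the routing is then free of cyclic dependencies exactly when no channel can reach itself in the closure. The function names and the four-channel example are hypothetical and are not taken from the paper, which may construct and use the matrix differently.

```python
def transitive_closure(dep):
    """Return the transitive closure of a Boolean dependency matrix (Warshall)."""
    n = len(dep)
    reach = [list(row) for row in dep]  # copy so the input is not modified
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def has_cyclic_dependency(dep):
    """True if some channel transitively depends on itself."""
    closure = transitive_closure(dep)
    return any(closure[i][i] for i in range(len(dep)))


# Hypothetical example: four channels c0..c3 whose allowed turns form a ring
# c0 -> c1 -> c2 -> c3 -> c0, which is a potential deadlock.
ring = [
    [False, True,  False, False],
    [False, False, True,  False],
    [False, False, False, True],
    [True,  False, False, False],
]

# Prohibiting the single turn c3 -> c0 breaks the cycle.
restricted = [list(row) for row in ring]
restricted[3][0] = False

ring_is_cyclic = has_cyclic_dependency(ring)
restricted_is_cyclic = has_cyclic_dependency(restricted)
```

In this toy case the ring is reported as cyclic and the restricted turn set is not, which mirrors how a candidate set of prohibited turns could be checked automatically instead of being chosen by inspection.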
