A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips

This paper introduces a new, practical routing algorithm, Maze-routing, to tolerate faults in network-on-chips. The algorithm is the first to provide all of the following properties at the same time: 1) fully-distributed with no centralized component, 2) guaranteed delivery (it guarantees to deliver packets when a path exists between nodes, or otherwise indicate that destination is unreachable, while being deadlock and livelock free), 3) low area cost, 4) low reconfiguration overhead upon a fault. To achieve all these properties, we propose Maze-routing, a new variant of face routing in on-chip networks and make use of deflections in routing. Our evaluations show that Maze-routing has 16X less area overhead than other algorithms that provide guaranteed delivery. Our Maze-routing algorithm is also high performance: for example, when up to 5 links are broken, it provides 50% higher saturation throughput compared to the state-of-the-art.

[1]  Nanning Zheng,et al.  Fault-tolerant routing for on-chip network without using virtual channels , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).

[2]  Coniferous softwood GENERAL TERMS , 2003 .

[3]  Jongman Kim,et al.  Do we need wide flits in Networks-on-Chip? , 2013, 2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI).

[4]  Ivan Stojmenovic,et al.  On delivery guarantees of face and combined greedy-face routing in ad hoc and sensor networks , 2006, MobiCom '06.

[5]  Davide Bertozzi,et al.  Synergistic use of multiple on-chip networks for ultra-low latency and scalable distributed routing reconfiguration , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[6]  Kevin Kai-Wei Chang,et al.  HAT: Heterogeneous Adaptive Throttling for On-Chip Networks , 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.

[7]  P. Baran,et al.  On Distributed Communications Networks , 1964 .

[8]  A. Singh,et al.  Fault-tolerant systems , 1990, Computer.

[9]  Valeria Bertacco,et al.  uDIREC: Unified diagnosis and reconfiguration for frugal bypass of NoC faults , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[10]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[11]  Timothy Mattson,et al.  A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).

[12]  Alain Greiner,et al.  A reconfigurable routing algorithm for a fault-tolerant 2D-Mesh Network-on-Chip , 2008, 2008 45th ACM/IEEE Design Automation Conference.

[13]  José Duato,et al.  An Efficient Implementation of Distributed Routing Algorithms for NoCs , 2008 .

[14]  Vijay Laxmi,et al.  d2-LBDR: Distance-driven routing to handle permanent failures in 2D mesh NoCs , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[15]  Chris Fallin,et al.  Next generation on-chip networks: what kind of congestion control do we need? , 2010, Hotnets-IX.

[16]  Alessandro Strano,et al.  OSR-Lite: Fast and deadlock-free NoC reconfiguration framework , 2012, 2012 International Conference on Embedded Computer Systems (SAMOS).

[17]  Srinivasan Seshan,et al.  On-chip networks from a networking perspective: congestion and scalability in many-core interconnects , 2012, CCRV.

[18]  Valeria Bertacco,et al.  Brisk and limited-impact NoC routing reconfiguration , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[19]  Natalie D. Enright Jerger,et al.  SCARAB: A single cycle adaptive routing and bufferless network , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[20]  Onur Mutlu,et al.  Concurrent autonomous self-test for uncore components in system-on-chips , 2010, 2010 28th VLSI Test Symposium (VTS).

[21]  Onur Mutlu,et al.  A case for bufferless routing in on-chip networks , 2009, ISCA '09.

[22]  Valentin Puente,et al.  Immunet: a cheap and robust fault-tolerant packet routing mechanism , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[23]  Sharad Malik,et al.  Power-driven Design of Router Microarchitectures in On-chip Networks , 2003, MICRO.

[24]  Kevin Kai-Wei Chang,et al.  MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.

[25]  Paul Baran ON DISTRIBUTED COMMUNICATIONS: XI. SUMMARY OVERVIEW, , 1964 .

[26]  Alexandre M. Amory,et al.  Topology-agnostic fault-tolerant NoC routing method , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[27]  D. West Introduction to Graph Theory , 1995 .

[28]  Axel Jantsch,et al.  Addressing Transient and Permanent Faults in NoC With Efficient Fault-Tolerant Deflection Router , 2013, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[29]  Reetuparna Das,et al.  Design and Evaluation of Hierarchical Rings with Deflection Routing , 2014, 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing.

[30]  Li-Shiuan Peh,et al.  ARIADNE: Agnostic Reconfiguration in a Disconnected Network Environment , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.

[31]  David Blaauw,et al.  A highly resilient routing algorithm for fault-tolerant NoCs , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.

[32]  Ivan Stojmenovic,et al.  Routing with Guaranteed Delivery in Ad Hoc Wireless Networks , 1999, DIALM '99.

[33]  Henry Hoffmann,et al.  On-Chip Interconnection Architecture of the Tile Processor , 2007, IEEE Micro.

[34]  Ivan Stojmenovic,et al.  Routing with Guaranteed Delivery in Ad Hoc Wireless Networks , 2001, Wirel. Networks.

[35]  Shekhar Y. Borkar,et al.  Designing reliable systems from unreliable components: the challenges of transistor variability and degradation , 2005, IEEE Micro.

[36]  Michael Burrows,et al.  Autonet: A High-Speed, Self-Configuring Local Area Network Using Point-to-Point Links , 1991, IEEE J. Sel. Areas Commun..

[37]  Shekhar Y. Borkar Microarchitecture and Design Challenges for Gigascale Integration , 2004, MICRO.

[38]  Federico Silla,et al.  Cost-Efficient On-Chip Routing Implementations for CMP and MPSoC Systems , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[39]  W. Dally,et al.  Route packets, not wires: on-chip interconnection networks , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[40]  Axel Jantsch,et al.  Evaluation of on-chip networks using deflection routing , 2006, GLSVLSI '06.

[41]  Onur Mutlu,et al.  Software-Based Online Detection of Hardware Defects Mechanisms, Architectural Support, and Evaluation , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[42]  Kevin Kai-Wei Chang,et al.  Bufferless and Minimally-Buffered Deflection Routing , 2014 .

[43]  Chris Fallin,et al.  CHIPPER: A low-complexity bufferless deflection router , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.