An Analysis of Connectivity and Yield for 2D Mesh Based NoC with Interconnect Router Failures

The manufacturing process of modern-day processors is both costly and complex, and many factors influence the quality of a chip when it comes off the production line. Typically, hundreds of chips are manufactured from a single silicon wafer, and as we go deeper into the sub-micron era of microchip manufacturing, the potential for defects during production increases. The advent of multi-core computing may introduce problems related to connectivity and yield for high-volume manufacturing (HVM). In this paper we explore the potential benefits that fault-tolerant routing provides within the network-on-chip (NoC) paradigm by studying the relationship between connectivity and yield at the interconnect routing level. For mesh NoCs based on dimension-order routing, we describe two methods that are logically straightforward to implement and that can be used to increase the yield of chips with interconnect router faults.
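
To make the notion of connectivity under router faults concrete, the following is a minimal sketch and not one of the two methods described in the paper. It assumes plain XY dimension-order routing (travel along X first, then along Y) on a 2D mesh and a connectivity metric defined as the fraction of ordered pairs of healthy routers whose XY path avoids every faulty router. The function names `xy_path` and `xy_connectivity` and the metric itself are illustrative assumptions.

```python
from itertools import product


def xy_path(src, dst):
    """Routers visited by XY dimension-order routing from src to dst, inclusive."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    # Correct the X coordinate first.
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    # Then correct the Y coordinate.
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def xy_connectivity(width, height, faulty):
    """Fraction of ordered healthy (src, dst) pairs whose XY path is fault-free.

    This metric is an assumption made for illustration only.
    """
    faulty = set(faulty)
    healthy = [n for n in product(range(width), range(height)) if n not in faulty]
    pairs = [(s, d) for s in healthy for d in healthy if s != d]
    if not pairs:
        return 0.0
    reachable = sum(
        all(node not in faulty for node in xy_path(s, d)) for s, d in pairs
    )
    return reachable / len(pairs)


if __name__ == "__main__":
    # A 3x3 mesh with a single faulty router in the centre.
    print(xy_connectivity(3, 3, [(1, 1)]))
```

Under these assumptions, a single faulty router disconnects many source-destination pairs even though the mesh remains physically connected, which is the gap that fault-tolerant routing aims to close.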
