Fault tolerant algorithms for network-on-chip interconnect

As technology scales, fault tolerance is becoming a key concern in on-chip communication. This work therefore examines fault-tolerant communication algorithms for the network-on-chip (NoC) domain. Two flooding algorithms and a random walk algorithm are investigated. We show that the flood-based fault-tolerant algorithms have an exceedingly high communication overhead, whereas the redundant random walk algorithm offers significantly reduced overhead while maintaining useful levels of fault tolerance. We then compare the implementation costs of these algorithms in terms of both area and energy consumption, and show that the flooding algorithms consume an order of magnitude more energy per message transmitted.
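
To make the overhead contrast concrete, the following is a minimal sketch and not the algorithms evaluated in this work. It assumes a 2D mesh topology, a fault model in which a faulty link silently drops messages, and link transmissions as a rough proxy for communication overhead. The function names (`neighbors`, `flood`, `redundant_random_walk`) and the parameters `copies` and `ttl` are hypothetical and chosen for illustration only.

```python
import random
from collections import deque


def neighbors(node, n):
    """Yield the mesh neighbours of `node` in an n x n grid (assumed topology)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < n and 0 <= ny < n:
            yield (nx, ny)


def flood(n, src, dst, faulty=frozenset()):
    """Simple flood: every node forwards a newly received message once
    to all of its neighbours. Returns (delivered, transmissions)."""
    seen = {src}
    queue = deque([src])
    sent = 0
    while queue:
        node = queue.popleft()
        for nb in neighbors(node, n):
            sent += 1  # a transmission is counted even if the link is dead
            if frozenset((node, nb)) in faulty:
                continue  # assumed fault model: message is dropped
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return dst in seen, sent


def redundant_random_walk(n, src, dst, copies, ttl, faulty=frozenset(), rng=None):
    """Send `copies` independent walkers, each limited to `ttl` hops.
    Returns (delivered, transmissions)."""
    rng = rng or random.Random(0)
    sent = 0
    delivered = False
    for _ in range(copies):
        node = src
        for _ in range(ttl):
            nb = rng.choice(list(neighbors(node, n)))
            sent += 1
            if frozenset((node, nb)) in faulty:
                break  # this copy is lost on the faulty link
            node = nb
            if node == dst:
                delivered = True
                break  # this copy has arrived; stop forwarding it
    return delivered, sent


if __name__ == "__main__":
    print("flood:", flood(4, (0, 0), (3, 3)))
    print("random walk:", redundant_random_walk(4, (0, 0), (3, 3), copies=4, ttl=16))
```

In this sketch the flood on a fault-free 4 x 4 mesh always costs 48 transmissions, one per direction of each of the 24 links, regardless of where the destination lies. The random walk cost is bounded by `copies * ttl`, which gives a tunable trade-off between overhead and the probability of delivery.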
