Improving a Fault-Tolerant Routing Algorithm Using Detailed Traffic Analysis

Routing protocols are currently compared using coarse measures such as global network latency. These measures give little insight into how traffic is distributed among network nodes in the presence of different fault regions. This paper presents a detailed traffic analysis of the f-cube routing algorithm, obtained with a specially developed tool. Per-node traffic analysis reveals the traffic hotspots caused by fault regions and greatly assists in developing fault-tolerant routing algorithms. Based on this detailed information, a simple yet effective improvement to f-cube is suggested. The effect of a traffic hotspot on the traffic of neighboring nodes and on global performance degradation is also investigated. To analyze per-node traffic, several per-node traffic metrics are introduced, and one of them is selected for the rest of the work. Aiming at a deeper understanding of traffic in faulty networks, this paper is the first attempt to investigate per-node traffic around fault regions.
