Communication Delay in Wormhole-Switched Tori Networks under Bursty Workloads

Workloads generated by real-world parallel applications running on a multicomputer strongly affect the performance of its interconnection network, the hardware fabric that supports communication among individual processors. Existing multicomputer networks have been designed and analysed primarily under the assumption that the workload follows a non-bursty Poisson arrival process. As a step towards a clearer understanding of network performance under various workloads, this paper presents a new analytical model, based on the well-known Markov-Modulated Poisson Process (MMPP), for computing message latency in wormhole-switched torus networks in the presence of bursty traffic. To derive the model, an approach that accurately captures the properties of composite MMPPs is applied to characterize the traffic on network channels. In addition, a general method is proposed for calculating the probability of virtual channel occupancy when the traffic on network channels follows a multi-state MMPP. Simulation experiments reveal that the model exhibits a good degree of accuracy.
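
To make the idea of a composite MMPP concrete, the sketch below superposes two independent two-state MMPPs. It assumes that "composite" refers to the superposition of independent MMPP streams sharing a channel, and it uses the standard result that such a superposition is again an MMPP whose generator and rate matrices are Kronecker sums of the component matrices. The function names and the numeric parameters are illustrative only; this is not the paper's derivation, which is not reproduced in the abstract.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron_sum(a, b):
    """Kronecker sum: (a kron I) + (I kron b)."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


def superpose(q1, lam1, q2, lam2):
    """Generator and rate matrix of the superposition of two independent MMPPs."""
    return kron_sum(q1, q2), kron_sum(lam1, lam2)


def mean_rate_two_state(q, lam):
    """Mean arrival rate of a two-state MMPP from its stationary distribution."""
    s1, s2 = q[0][1], q[1][0]          # transition rates 1->2 and 2->1
    pi1, pi2 = s2 / (s1 + s2), s1 / (s1 + s2)
    return pi1 * lam[0][0] + pi2 * lam[1][1]


# Hypothetical parameters: an on/off source and a two-level source.
Q1 = [[-1.0, 1.0], [3.0, -3.0]]
L1 = [[4.0, 0.0], [0.0, 0.0]]
Q2 = [[-2.0, 2.0], [2.0, -2.0]]
L2 = [[1.0, 0.0], [0.0, 5.0]]

Q, L = superpose(Q1, L1, Q2, L2)       # four-state composite MMPP
composite_rates = [L[i][i] for i in range(4)]
total_mean = mean_rate_two_state(Q1, L1) + mean_rate_two_state(Q2, L2)
```

The composite process has one state per pair of component states, so its state count grows multiplicatively with the number of superposed streams, which is why a method that handles multi-state MMPPs is needed when many sources share a channel.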
