Performance evaluation of COWS under real parallel applications

Clusters of workstations (COWS) are often arranged as switch-based networks with irregular topology. Interconnection networks for COWS have usually been evaluated by simulation, driven either by synthetic traffic or by traces from real parallel applications. Both kinds of traffic give only a first approximation of system behavior; more accurate results can be obtained by running real parallel applications. This paper presents a new simulation framework that evaluates interconnection networks under real parallel applications by means of an execution-driven simulator. The simulator can also be used to evaluate how design parameters other than the interconnection network affect the performance of the whole system. Evaluation results show that the execution time of real parallel applications can be reduced by using an effective routing algorithm. Moreover, in some cases, the improvement exceeds that obtained by improving other design parameters, such as the processor instruction issue rate, the cache size, or the network bandwidth.