Toward Grid-Aware Time Warp

We study the adaptation of an optimistic Time Warp kernel to cross-cluster computing on the grid. Wide-area communication, the primary source of overhead, is offloaded onto dedicated routing processes. This lets the simulation processes run at full speed and significantly narrows the performance gap caused by wide-area distribution. Message aggregation on the wide-area links yields further improvements. We achieve many of our objectives for lazy cancellation and moderate communication, but high communication rates with aggressive cancellation remain a challenge.
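
To make the idea of message aggregation on a wide-area link concrete, the following is a minimal Python sketch and not the kernel described here. The class name `WanAggregator`, the `(timestamp, payload)` message format, the `send_bundle` callback, and the two thresholds are all assumptions introduced for illustration. A routing process would buffer outgoing messages for a remote cluster and ship them as one bundle once either a count limit or an age limit is reached.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical message format: (virtual timestamp, payload).
Message = Tuple[float, str]


@dataclass
class WanAggregator:
    """Illustrative aggregation buffer for one wide-area link (assumed design)."""
    send_bundle: Callable[[List[Message]], None]  # assumed transport callback
    max_messages: int = 4      # flush once this many messages are buffered
    max_delay: float = 0.010   # flush once the oldest message is this old (s)
    _buffer: List[Message] = field(default_factory=list)
    _oldest: float = 0.0       # wall-clock arrival time of the oldest message

    def enqueue(self, msg: Message, now: float) -> None:
        """Buffer a message; flush immediately if the count limit is hit."""
        if not self._buffer:
            self._oldest = now
        self._buffer.append(msg)
        if len(self._buffer) >= self.max_messages:
            self.flush()

    def poll(self, now: float) -> None:
        """Called periodically; flush if the oldest message waited too long."""
        if self._buffer and now - self._oldest >= self.max_delay:
            self.flush()

    def flush(self) -> None:
        """Send all buffered messages as a single bundle."""
        if self._buffer:
            self.send_bundle(self._buffer)
            self._buffer = []
```

The age limit bounds the extra latency that aggregation adds to any single message. That matters for an optimistic kernel, since delayed messages (including cancellation messages) can let remote processes compute further along a path that will later be rolled back.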
