Toward Grid-Aware Time Warp

The authors study the adaptation of an optimistic Time Warp kernel to cross-cluster computing on the Grid. Wide-area communication, the primary source of overhead, is offloaded onto dedicated routing processes. This allows the simulation processes to run at full speed and thus significantly narrows the performance gap caused by wide-area distribution. Further improvements are obtained by aggregating messages on the wide-area links and by using a distributed global virtual time algorithm. The authors achieve many of their objectives for a cellular automaton simulation with lazy cancellation and moderate communication. High communication rates, especially with aggressive cancellation, remain a challenge, as experiments with synthetic loads confirm. Even then, a satisfactory speedup can be achieved, provided that the computational grain of events is large enough.
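To make the idea of message aggregation on wide-area links concrete, the following is a minimal sketch and not the authors' implementation. It assumes a routing process that buffers outgoing messages per remote cluster and forwards them as one batch when either a size threshold or a maximum waiting time is reached. The class name `WideAreaAggregator`, the `send` hook, the message format, and the parameters `max_batch` and `max_delay` are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical event format: (virtual timestamp, payload).
Message = Tuple[float, str]


@dataclass
class WideAreaAggregator:
    """Buffers messages per remote cluster and forwards them in batches."""

    send: Callable[[str, List[Message]], None]  # hypothetical wide-area send hook
    max_batch: int = 4        # flush once this many messages are buffered
    max_delay: float = 0.05   # or once the oldest message has waited this long (s)
    _buffers: Dict[str, List[Message]] = field(default_factory=dict)
    _first_seen: Dict[str, float] = field(default_factory=dict)

    def enqueue(self, cluster: str, msg: Message, now: float) -> None:
        """Add a message for `cluster`; flush if the batch is full."""
        buf = self._buffers.setdefault(cluster, [])
        if not buf:
            # Remember when the oldest buffered message arrived.
            self._first_seen[cluster] = now
        buf.append(msg)
        if len(buf) >= self.max_batch:
            self.flush(cluster)

    def tick(self, now: float) -> None:
        """Flush every buffer whose oldest message has waited too long."""
        for cluster in list(self._buffers):
            if now - self._first_seen[cluster] >= self.max_delay:
                self.flush(cluster)

    def flush(self, cluster: str) -> None:
        """Send all buffered messages for `cluster` as a single batch."""
        batch = self._buffers.pop(cluster, [])
        self._first_seen.pop(cluster, None)
        if batch:
            self.send(cluster, batch)


if __name__ == "__main__":
    sent = []
    agg = WideAreaAggregator(send=lambda c, b: sent.append((c, b)), max_batch=2)
    agg.enqueue("cluster-B", (10.0, "event-1"), now=0.0)
    agg.enqueue("cluster-B", (11.0, "event-2"), now=0.01)  # triggers a flush
    print(sent)
```

The trade-off in such a scheme is between fewer wide-area sends and added delay: in an optimistic simulation, a message held back in the buffer may arrive later at its destination and so lengthen a rollback, which is why the sketch bounds the waiting time as well as the batch size.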
