Optimistic distributed simulation based on transitive dependency tracking

In traditional optimistic distributed simulation protocols, a logical process (LP) that receives a straggler rolls back and sends out anti-messages. The receiver of an anti-message may in turn roll back and send out more anti-messages, so a single straggler can trigger a large number of anti-messages and multiple rollbacks of the same LP. In the authors' protocol, an LP that receives a straggler broadcasts its rollback. On receiving this announcement, other LPs may roll back, but they do not announce their own rollbacks, so each LP rolls back at most once in response to each straggler. Anti-messages are not used. This eliminates the need for output queues and simplifies memory management. It also eliminates cascading rollbacks and echoing, and results in faster simulation. All of this is achieved by a scheme for maintaining transitive dependency information. The cost is that each message must be tagged with extra dependency information and that processing a received message takes longer. The authors also describe the similarities between distributed simulation and distributed recovery, and show how solutions from one area can be applied to the other.
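The rollback rule described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simplified dependency vector in which `dep[j]` holds the latest virtual time of LP `j` that the current state transitively depends on, it omits details a full protocol would need (such as distinguishing incarnations after a rollback), and the names `LP`, `process_and_send`, `receive`, `on_straggler`, and `on_announcement` are hypothetical.

```python
class LP:
    """Toy logical process that tracks transitive dependencies (illustrative only)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.dep = [0] * n                # dep[j]: latest virtual time of LP j we depend on
        self.history = [list(self.dep)]   # saved dependency vectors, oldest first
        self.rollbacks = 0

    def process_and_send(self, vt):
        """Process a local event at virtual time vt and return the tag for an outgoing message."""
        self.dep[self.pid] = max(self.dep[self.pid], vt)
        self.history.append(list(self.dep))
        return list(self.dep)

    def receive(self, tag, vt):
        """Merge the sender's tag (component-wise max), then process the message at time vt."""
        self.dep = [max(a, b) for a, b in zip(self.dep, tag)]
        self.dep[self.pid] = max(self.dep[self.pid], vt)
        self.history.append(list(self.dep))

    def _restore(self, j, t):
        # Discard saved states that depend on LP j at virtual time >= t.
        while len(self.history) > 1 and self.history[-1][j] >= t:
            self.history.pop()
        self.dep = list(self.history[-1])
        self.rollbacks += 1

    def on_straggler(self, t):
        """Roll back past the straggler time t and return the announcement to broadcast."""
        self._restore(self.pid, t)
        return (self.pid, t)

    def on_announcement(self, announcement):
        """Roll back if the current state is an orphan; never re-announce."""
        j, t = announcement
        if self.dep[j] >= t:
            self._restore(j, t)


def demo():
    lps = [LP(i, 4) for i in range(4)]
    tag = lps[0].process_and_send(10)   # LP0 -> LP1
    lps[1].receive(tag, 12)
    tag = lps[1].process_and_send(15)   # LP1 -> LP2, carries LP0's dependency transitively
    lps[2].receive(tag, 20)
    announcement = lps[0].on_straggler(5)
    for lp in lps[1:]:
        lp.on_announcement(announcement)
    return lps
```

In `demo()`, LP2 never hears from LP0 directly, yet its vector records the dependency on LP0 because LP1's tag carried it. A single announcement from LP0 is therefore enough for LP2 to detect that it is an orphan, and no anti-messages or secondary announcements are needed. LP3, which depends on nothing, ignores the announcement.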
