Design Issues for Optimistic Distributed Discrete Event Simulation

Simulation is a powerful tool for studying the dynamics of a system, but it is time-consuming, so it is natural to use multiple processors to speed it up. Many protocols have been proposed for performing discrete event simulation in multiprocessor environments. Most of these distributed discrete event simulation protocols are either conservative or optimistic, and the most common optimistic protocol is Time Warp. Several issues must be considered when designing a Time Warp simulation, including reducing the state saving overhead and designing the global control mechanism (i.e., global virtual time computation, memory management, distributed termination, and fault tolerance). This paper addresses these issues. We propose a heuristic for selecting the checkpoint interval to reduce the state saving overhead, generalize a previously proposed global virtual time computation algorithm, and present new algorithms for memory management, distributed termination, and fault tolerance. The main contribution of this paper is a set of guidelines for designing an efficient Time Warp simulation.
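To make the trade-off behind the checkpoint interval concrete, the following is a minimal sketch of a toy Time Warp logical process. It is not the heuristic or any algorithm from this paper. The names `LogicalProcess`, `checkpoint_interval`, `receive`, and `gvt` are hypothetical, the state is just a running sum, and anti-messages are omitted. It only illustrates that saving state less often means fewer checkpoints but more events to re-execute after a rollback, and that global virtual time is bounded by the smallest local virtual time and in-transit timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy Time Warp process (hypothetical; state is a running sum of event values)."""
    checkpoint_interval: int                            # save state every k processed events
    state: int = 0
    lvt: float = 0.0                                    # local virtual time
    processed: list = field(default_factory=list)       # (timestamp, value), timestamp order
    checkpoints: list = field(default_factory=list)     # (timestamp, state, events processed)

    def __post_init__(self):
        # Initial checkpoint, older than any event, so a rollback always has a target.
        self.checkpoints.append((float("-inf"), self.state, 0))

    def _execute(self, ts, value):
        self.state += value
        self.lvt = ts
        self.processed.append((ts, value))
        if len(self.processed) % self.checkpoint_interval == 0:
            self.checkpoints.append((ts, self.state, len(self.processed)))

    def receive(self, ts, value):
        """Process an event; return how many earlier events had to be re-executed."""
        if ts >= self.lvt:
            self._execute(ts, value)
            return 0
        # Straggler: discard checkpoints taken at or after its timestamp.
        while self.checkpoints[-1][0] >= ts:
            self.checkpoints.pop()
        _, saved_state, count = self.checkpoints[-1]
        redo = self.processed[count:]
        self.processed = self.processed[:count]
        self.state = saved_state
        self.lvt = self.processed[-1][0] if self.processed else 0.0
        # Re-execute the undone events together with the straggler, in timestamp order.
        for event in sorted(redo + [(ts, value)]):
            self._execute(*event)
        return len(redo)


def gvt(processes, in_transit_timestamps):
    """Minimum over all local virtual times and timestamps of messages still in transit."""
    return min([lp.lvt for lp in processes] + list(in_transit_timestamps))
```

For instance, feeding events with timestamps 1.0 through 6.0 and then a straggler at 2.5 gives the same final state for `checkpoint_interval=1` and `checkpoint_interval=4`, but the second setting keeps fewer checkpoints and re-executes more events. State saved before the value returned by `gvt` can never be needed by a rollback, apart from the latest checkpoint preceding it, which is what makes that value useful for memory management.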
