Experiences with optimistic synchronization for distributed operating systems

Optimistic synchronization is a method of synchronizing parallel and distributed computations without blocking. Where a non-optimistic system would block, an optimistic synchronization mechanism permits operations to go ahead. If such optimism causes improper synchronization, the mis-synchronized work is undone and the entire system is restored to a consistent state. This paper discusses experiences from developing the Time Warp Operating System (TWOS), a distributed operating system based on optimistic synchronization. It covers the challenges of implementing such a system, the advantages of optimistic synchronization, and how well optimistic synchronization works in practice in TWOS. It also offers advice for others developing systems that use optimistic synchronization.
