Transparent State Management for Optimistic Synchronization in the High Level Architecture

In this paper we present the design and implementation of a software architecture, the Magic State Manager (MASM), to be employed within a run-time infrastructure (RTI) in support of HLA federations. MASM performs checkpointing and recovery of a federate's state in a way that is completely transparent to the federate itself, which makes it possible to delegate to the RTI every task related to state management in optimistic synchronization. Unlike existing proposals, our approach requires the federate programmer neither to supply state-management modules within the federate code nor to explicitly interface the federate code with existing third-party checkpointing/recovery libraries. The federate programmer is therefore completely relieved of the burden of dealing with state management. We also report experimental results showing that MASM introduces minimal run-time overhead in two case studies: an interconnection network simulation and a personal communication system simulation.
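
To make the idea of delegating state management to the infrastructure concrete, the following is a minimal sketch and not MASM's actual mechanism or API. It assumes a toy federate whose whole state lives in its instance attributes, and it substitutes plain deep copies for whatever lower-level technique MASM uses. The names `StateManager`, `CounterFederate`, `checkpoint`, `rollback`, and `demo` are hypothetical and introduced only for illustration. The point it shows is that the federate class contains no save or restore code at all.

```python
import copy


class StateManager:
    """Hypothetical infrastructure-side manager (illustrative, not the MASM API)."""

    def __init__(self, federate):
        self._federate = federate
        self._log = []  # (sim_time, snapshot) pairs in increasing time order

    def checkpoint(self, sim_time):
        # Assumption: the federate's state is exactly its instance attributes.
        self._log.append((sim_time, copy.deepcopy(vars(self._federate))))

    def rollback(self, to_time):
        # Discard checkpoints taken after the rollback time.
        while self._log and self._log[-1][0] > to_time:
            self._log.pop()
        if not self._log:
            raise ValueError("no checkpoint at or before the requested time")
        restored_time, snapshot = self._log[-1]
        # Overwrite the federate's attributes in place with a copy of the snapshot.
        state = vars(self._federate)
        state.clear()
        state.update(copy.deepcopy(snapshot))
        return restored_time


class CounterFederate:
    """Toy federate: note that it has no state-management code."""

    def __init__(self):
        self.events_processed = 0
        self.queue_lengths = []

    def process_event(self, queue_length):
        self.events_processed += 1
        self.queue_lengths.append(queue_length)


def demo():
    fed = CounterFederate()
    mgr = StateManager(fed)
    for sim_time, queue_length in [(1, 4), (2, 7), (3, 2)]:
        mgr.checkpoint(sim_time)  # snapshot taken before the event is processed
        fed.process_event(queue_length)
    # A straggler with timestamp 2 would force a rollback to the state at time 2.
    restored_time = mgr.rollback(2)
    return restored_time, fed.events_processed, fed.queue_lengths
```

Calling `demo()` rolls the toy federate back to the snapshot taken before the event at time 2 was processed, so only the first event remains reflected in its state. A real system would also have to bound the cost of each snapshot, which is what the reported overhead measurements address.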
