Hardware-Transactional-Memory Based Speculative Parallel Discrete Event Simulation of Very Fine Grain Models

This article presents an innovative runtime support for speculative parallel processing of discrete event simulation models on multi-core architectures, which exploits Hardware-Transactional-Memory (HTM) facilities for state recoverability. In our proposal, speculative updates to the state of the simulation model are executed as concurrent HTM-based transactions, which are also in charge of detecting whether each update is consistent with the advancement of logical time along model execution. The proposal is fully transparent to the application code. Hence, our HTM-based runtime support can host conventionally developed discrete event models that rely on event handlers dispatched by an underlying simulation engine. Experimental data show that our proposal provides 75% to 92% of the ideal speedup on an Intel Haswell based platform (equipped with 4 physical cores and HTM support) for discrete event models with event granularity ranging between 2 and 12 microseconds. The data also show that these same models cannot be executed efficiently on top of a latest-generation parallel discrete event simulation platform employing software-based recoverability.
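
The idea of running each event handler as a transaction that commits only when it is consistent with logical time can be illustrated with a small software emulation. This is a minimal sketch, not the authors' implementation: it uses no HTM instructions, it runs sequentially, and the commit rule (an event commits only if its timestamp is the lowest among all pending events) is an assumption made for illustration. The names `run_speculative`, `simulate`, `record_event` and `AbortTransaction` are hypothetical.

```python
import copy
from collections import deque


class AbortTransaction(Exception):
    """Raised when a speculative update cannot be committed yet."""


def run_speculative(state, event, pending_timestamps, handler):
    """Emulate one transactional event execution (software only, no HTM).

    The handler works on a private copy of the state, standing in for the
    transaction's buffered writes. Assumed commit rule: the copy is
    installed only if the event has the lowest pending timestamp.
    """
    timestamp, payload = event
    shadow = copy.deepcopy(state)          # speculative working copy
    handler(shadow, timestamp, payload)    # unmodified event handler
    if timestamp > min(pending_timestamps):
        # Not consistent with logical time: drop the copy, state untouched.
        raise AbortTransaction(timestamp)
    state.clear()
    state.update(shadow)                   # commit the speculative update


def simulate(state, events, handler):
    """Process events in arrival order, retrying the ones that abort.

    Returns the number of aborted attempts.
    """
    queue = deque(events)
    aborts = 0
    while queue:
        event = queue.popleft()
        pending = [ts for ts, _ in queue] + [event[0]]
        try:
            run_speculative(state, event, pending, handler)
        except AbortTransaction:
            aborts += 1
            queue.append(event)            # retry after earlier events
    return aborts


def record_event(state, timestamp, payload):
    """Example handler (hypothetical): advance the clock and log the payload."""
    state["clock"] = timestamp
    state["log"].append(payload)


if __name__ == "__main__":
    model_state = {"clock": 0.0, "log": []}
    out_of_order = [(3.0, "c"), (1.0, "a"), (2.0, "b")]
    n_aborts = simulate(model_state, out_of_order, record_event)
    print(model_state, n_aborts)
```

The handler is written as ordinary sequential code and knows nothing about the surrounding transaction, which mirrors the transparency property claimed above. In a real HTM setting the private copy and its discard would be provided by the hardware, and transactions would run concurrently on different cores.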
