Load-Aware Dynamic Time Synchronization in Parallel Discrete Event Simulation

Traditional Parallel Discrete Event Simulation (PDES) systems take a monolithic approach to choosing their thread synchronization protocol: they implement either Time Window-based conservative synchronization or optimistic event processing based on Time Warp synchronization. In this paper, we show that this binary choice is suboptimal and unnecessary, particularly in the realistic situation where the load distribution across the simulation domain changes over time. We therefore propose a new PDES synchronization scheme, called Hybrid PDES, that dynamically switches between conservative and optimistic synchronization protocols based on the runtime characteristics of the simulation. The primary objective of Hybrid PDES is to exploit optimistic event processing for as long as it benefits system performance and scalability. We implement Hybrid PDES in the Python- and Lua-based Simian PDES engines and demonstrate up to 3x performance improvement on Intel Knights Landing and AMD EPYC processors using the PHOLD, LA-PDES, and PPT-GPU simulation applications.
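To make the idea of runtime switching concrete, the following is a minimal Python sketch of one possible switching policy. It is not the policy used by Hybrid PDES, which the abstract does not specify. The names `WindowStats`, `ModeSelector`, `min_efficiency`, and `probe_after` are hypothetical, and the choice of rollback efficiency as the runtime characteristic is an assumption made purely for illustration.

```python
from dataclasses import dataclass

CONSERVATIVE = "conservative"
OPTIMISTIC = "optimistic"


@dataclass
class WindowStats:
    """Hypothetical per-window counters gathered by the engine."""
    committed: int    # events whose results were kept
    rolled_back: int  # events undone by rollbacks


def efficiency(stats: WindowStats) -> float:
    """Fraction of processed events that were not rolled back."""
    total = stats.committed + stats.rolled_back
    return 1.0 if total == 0 else stats.committed / total


class ModeSelector:
    """Assumed policy: stay optimistic while rollbacks are cheap.

    Falls back to conservative mode when efficiency drops below
    min_efficiency, then retries optimistic mode after probe_after
    conservative windows (conservative windows produce no rollbacks,
    so efficiency alone cannot trigger the switch back).
    """

    def __init__(self, min_efficiency: float = 0.6, probe_after: int = 3):
        self.mode = OPTIMISTIC
        self.min_efficiency = min_efficiency
        self.probe_after = probe_after
        self._conservative_windows = 0

    def update(self, stats: WindowStats) -> str:
        """Consume the stats of the window just finished; return next mode."""
        if self.mode == OPTIMISTIC:
            if efficiency(stats) < self.min_efficiency:
                self.mode = CONSERVATIVE
                self._conservative_windows = 0
        else:
            self._conservative_windows += 1
            if self._conservative_windows >= self.probe_after:
                self.mode = OPTIMISTIC
        return self.mode
```

In such a sketch, the engine would call `update` once per synchronization window and run the next window under the returned protocol. The periodic probe is one simple way to rediscover phases where optimistic processing pays off again after the load distribution shifts.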
