Synchronisation for dynamic load balancing of decentralised conservative distributed simulation

Synchronisation mechanisms are essential in distributed simulation. Some systems rely on a central unit to control the simulation, but central units are known bottlenecks. Removing the central unit to improve simulation speed, however, means losing the ability to act on the simulation at a global scale. That ability is important because it makes it possible to dynamically load-balance a distributed simulation. While some local partitioning algorithms exist, their lack of a global view reduces their efficiency. Running a global partitioning algorithm without a central unit requires synchronising all logical processes (LPs) at the same step. This paper presents two algorithms for doing so. The first requires knowledge of some topological properties of the network, while the second works without any such requirement. The algorithms are detailed and compared against each other. An evaluation shows the benefits of global dynamic load balancing for distributed simulations.
