Scalable and Fast Simulation of Peer-to-Peer Systems Using SimGrid

Simulation is a common experimental methodology in distributed systems because it allows ideas to be tested easily and quickly. In some domains, such as Peer-to-Peer (P2P) systems or Volunteer Computing, most studies rely on simulation. Although several simulators are now available, many researchers still choose to develop their own custom tool. This can certainly be explained by the apparent simplicity of doing so, but the task becomes tedious when the goal is to simulate very large systems quickly and realistically. In this paper we present the new architecture of the general-purpose simulation framework SimGrid, which provides significantly more realistic and flexible simulation capabilities than these existing simulators. Our key contribution is a new implementation of the simulation core that enables the parallel execution of user code during simulation, achieving faster and more scalable simulations. SimGrid now outperforms the reference simulator in this area in both speed (one order of magnitude faster) and scalability (scenarios 10 times larger), while providing better simulation accuracy. We discuss the key issues in implementing this parallelism, analyze its trade-offs, and give a criterion for identifying the kinds of scenarios in which a speedup can be expected. Finally, we present several experiments that evaluate its performance and scalability in different domains, in particular a simulation of the Chord peer-to-peer protocol with two million nodes on a single computer.