Network aware time management and event distribution

We discuss new synchronization algorithms for parallel and distributed discrete event simulations (PDES) that exploit the capabilities and behavior of the underlying communications network. Previous work in this area has assumed the network to be a black box that provides a one-to-one, reliable, and in-order message passing paradigm. In our work, we utilize the broadcast capability of the ubiquitous Ethernet for synchronization computations, and both unreliable and reliable protocols for message passing, to achieve more efficient communications between the participating systems. We describe two new algorithms for computing a distributed snapshot of global reduction operations on monotonically increasing values. The algorithms require O(N) messages (where N is the number of systems participating in the snapshot) in the normal case. We specifically target the use of these algorithms in distributed discrete event simulations to determine a global lower bound on time stamp (LBTS), but expect that they have applicability outside the simulation community.
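To make the reduction concrete, the following is a minimal sketch of a global lower bound on time stamp computed as a minimum over per-system values. It is not either of the algorithms described in this paper: it runs in a single process, ignores message loss and the snapshot consistency issues that the real algorithms must handle, and models a broadcast as one report per system, so that N systems produce N messages. The names `Participant`, `local_bound`, and `global_lower_bound`, the lookahead term, and the attribution of in-transit messages to their sender are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Participant:
    """Hypothetical per-system state used only for this illustration."""
    next_event_time: float          # time stamp of the earliest unprocessed local event
    lookahead: float                # assumed minimum delay before a new message can take effect
    in_transit: List[float] = field(default_factory=list)  # time stamps of sent, not yet received messages

    def local_bound(self) -> float:
        # Smallest time stamp this system could still contribute to the simulation.
        candidates = [self.next_event_time + self.lookahead]
        candidates.extend(self.in_transit)
        return min(candidates)


def global_lower_bound(participants: List[Participant]) -> Tuple[float, int]:
    """Return (global minimum, number of reports), one report per participant."""
    # Each participant "broadcasts" exactly one value (assumption: no loss, no retries).
    reports = [p.local_bound() for p in participants]
    return min(reports), len(reports)


if __name__ == "__main__":
    systems = [
        Participant(next_event_time=10.0, lookahead=1.0),
        Participant(next_event_time=7.0, lookahead=2.0, in_transit=[8.5]),
        Participant(next_event_time=12.0, lookahead=0.5),
    ]
    bound, messages = global_lower_bound(systems)
    print(bound, messages)
```

Because the reduction is a minimum over values that never decrease, a stale report can only make the computed bound smaller than the true one, which keeps the result conservative.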