Local versus global lookahead in conservative parallel simulations

Abstract: This paper presents an algorithm, which we refer to as SGTNE, that efficiently obtains lookahead information from a cluster of processors in a parallel simulation in order to unblock logical processes (LPs) in a given processor. SGTNE is based on TNE, a conservative synchronization scheme in which each processor independently executes a shortest path algorithm to provide lookahead to its resident LPs. Because TNE is executed on individual processors, it is susceptible to inter-processor deadlocks, which must be detected and broken at some cost. SGTNE (Semi-Global TNE) avoids these deadlocks by executing a shortest path algorithm over a snapshot of the LPs in a cluster of processors. An experimental study of SGTNE was conducted on an Intel Paragon A4. The study compared SGTNE to TNE and to an optimized version of the Chandy–Misra (CM) null message algorithm. We also investigated several scheduling algorithms for SGTNE and determined the factors influencing its performance, most notably partitioning. Our results indicate that SGTNE provides good speedup relative to the fastest sequential algorithm and that it outperforms TNE; for the population level examined, SGTNE was 3–5 times as fast as the CM algorithm.
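The idea of deriving lookahead from a shortest path computation over a snapshot of LPs can be illustrated with a minimal sketch. This is not the SGTNE algorithm from the paper; it is a generic multi-source Dijkstra under assumed conventions: each LP reports the timestamp of its next local event (infinity if it is blocked with nothing to process), and each link carries a minimum timestamp increment. The names `earliest_event_bounds`, `snapshot`, and `links` are hypothetical and chosen only for this illustration.

```python
import heapq
import math


def earliest_event_bounds(next_event, links):
    """Lower-bound the time of the earliest event each LP can still see.

    next_event: dict LP -> timestamp of its next local event
                (math.inf for a blocked LP with no pending event).
    links:      iterable of (src, dst, delay), where delay is the assumed
                minimum timestamp increment on the link src -> dst.
    """
    adjacency = {lp: [] for lp in next_event}
    for src, dst, delay in links:
        adjacency[src].append((dst, delay))

    # Every LP starts at its own next-event time (multi-source Dijkstra).
    bound = dict(next_event)
    heap = [(t, lp) for lp, t in bound.items() if t < math.inf]
    heapq.heapify(heap)

    while heap:
        t, lp = heapq.heappop(heap)
        if t > bound[lp]:
            continue  # stale heap entry, a smaller bound was already found
        for dst, delay in adjacency[lp]:
            candidate = t + delay
            if candidate < bound[dst]:
                bound[dst] = candidate
                heapq.heappush(heap, (candidate, dst))
    return bound


# Hypothetical snapshot: three LPs in a ring, only A has a pending event.
snapshot = {"A": 10.0, "B": math.inf, "C": math.inf}
links = [("A", "B", 2.0), ("B", "C", 3.0), ("C", "A", 1.0)]
bounds = earliest_event_bounds(snapshot, links)
```

In this sketch the bound for A equals its own next-event time, so no other LP in the snapshot can send it anything earlier and A could proceed. Running the same computation over the LPs of a single processor corresponds loosely to the local case described in the abstract, and running it over a snapshot spanning a cluster of processors to the semi-global case.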
