Performance prediction of large-scale parallel discrete event models of physical systems

We present a virtualization system designed to help predict the performance of parallel/distributed discrete event simulations on massively parallel (supercomputing) platforms. The system is intended for experimenting with, and understanding the effects of, execution parameters such as load balancing schemes and mixtures of model fidelity. A case study in the context of plasma physics simulations highlights important virtualization challenges, such as reentrancy and synchronization in the virtual plane, together with our approaches to solving them. We also present a trace-based prediction methodology and evaluate it with a 1-D hybrid collisionless shock model simulation, validating the predicted performance against the performance measured in actual simulation runs. Predicted performance shows excellent agreement with measured performance on parallel platforms containing up to 512 CPUs.
