Trace-driven co-simulation of high-performance computing systems using OMNeT++

Developing next-generation high-performance computing systems often calls for an "end-to-end" simulation tool that can simulate the behaviour of a full application on a reasonably faithful model of the actual system. Given the ever-increasing levels of parallelism, we take a communication-centric view of the system, based on application traces collected at the Message Passing Interface (MPI) level. We present an integrated toolchain for evaluating the impact of all aspects of the interconnection network on the performance of parallel applications. The network simulator, based on OMNeT++, provides a socket-based co-simulation interface to the MPI task simulator, which replays traces obtained using an instrumentation package. Both simulators generate output that can be evaluated with a visualization tool. A set of additional tools is provided to translate generic topology files to the OMNeT++ NED format, import route files at run time, perform routing optimizations, and generate particular topologies. We also present several example results that provide insights that would not have been obtainable without this integrated environment.
