High-performance computing (HPC) is used to simulate highly complex systems and processes in many scientific communities, e.g., particle physics, weather and climate research, the biosciences, materials science, pharmaceutics, astronomy, and finance.
Current HPC systems are so complex that the design of such a system, including architecture design space exploration and performance prediction, requires HPC-like simulation capabilities. To this end, we developed an Omnest-based simulation environment that enables studying the impact of an HPC machine's communication subsystem on the overall system's performance for specific workloads.
As current high-end HPC systems comprise hundreds of thousands of processing cores, full-system simulation at an abstraction level that still retains a reasonably high degree of detail is infeasible without resorting to parallel simulation; the main limiting factors are simulation run time and memory footprint.
We describe our experiences in adapting our simulation environment to take advantage of the parallel distributed simulation capabilities provided by Omnest. We present results obtained on a many-core SMP machine as well as a small-scale InfiniBand cluster.
Furthermore, we ported our simulation environment, including Omnest itself, to the massively parallel IBM Blue Gene®/P platform. We report results from initial experiments on this platform using up to 512 cores in parallel.
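For context, enabling parallel distributed simulation in OMNEST/OMNeT++ is largely a matter of configuration. The sketch below is illustrative only: the network and module names are placeholders, and it assumes the framework's standard MPI-based communication layer and conservative null-message synchronization; it is not the authors' actual configuration.

    [General]
    network = HpcSystem                        # placeholder top-level network name
    parallel-simulation = true                 # enable parallel distributed simulation
    parsim-communications-class = "cMPICommunications"      # MPI-based inter-partition messaging
    parsim-synchronization-class = "cNullMessageProtocol"   # conservative null-message synchronization
    # Map parts of the model to logical processes (module names are placeholders):
    *.computeNode[0]**.partition-id = 0
    *.computeNode[1]**.partition-id = 1

A model partitioned in this way is then launched with as many MPI ranks as partitions, e.g. via mpirun.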