Paper information: Enabling Parallel Simulation of Large-Scale HPC Network Systems

With the increasing complexity of today's high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems, in particular networks. To support effective design decisions, simulations of these systems must (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today's IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today's high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
