Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation

Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node-allocation shapes and by holes left by dedicated input/output nodes or hardware failures. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million-atom full-machine simulations, and for similarly partitioning node allocations into compact domains for smaller simulations that use multiple-copy algorithms. Additional enabling techniques are discussed, and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison.
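
To make the mapping problem concrete, the following is a minimal illustrative sketch, not the method used in NAMD. It assumes a complete, regular torus with no holes, and the names `torus_hops` and `average_neighbor_hops` are hypothetical. It computes the hop distance between two nodes on a torus with wraparound links, then scores a placement of a periodic 3-D domain grid by the average hop count between face-adjacent domains.

```python
from itertools import product

def torus_hops(a, b, dims):
    """Minimal hop count between coordinates a and b on a torus.

    Each dimension has wraparound links, so the distance along an axis
    is the shorter of the direct path and the path around the ring.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def average_neighbor_hops(grid_dims, torus_dims, mapping):
    """Average torus hops between face-adjacent domains of a periodic 3-D grid.

    `mapping` is a dict from domain grid coordinates to torus node
    coordinates (a hypothetical representation chosen for this sketch).
    """
    total = 0
    count = 0
    for cell in product(*(range(n) for n in grid_dims)):
        for axis in range(3):
            neighbor = list(cell)
            # Periodic boundary: the last domain neighbors the first.
            neighbor[axis] = (cell[axis] + 1) % grid_dims[axis]
            total += torus_hops(mapping[cell], mapping[tuple(neighbor)], torus_dims)
            count += 1
    return total / count

# Identity placement of a 4x4x4 domain grid onto a 4x4x4 torus:
# every neighboring pair of domains sits on directly linked nodes.
dims = (4, 4, 4)
identity = {c: c for c in product(*(range(n) for n in dims))}
score = average_neighbor_hops(dims, dims, identity)
```

A lower score means neighboring domains sit closer together on the network. Irregular allocation shapes and holes, which this sketch deliberately ignores, are what make the real problem hard.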
