Scaling applications to massively parallel machines using Projections performance analysis tool

Some of the most challenging applications to parallelize scalably are those that perform a relatively small amount of computation per iteration. In such cases, multiple interacting performance problems must be identified and solved to attain high parallel efficiency. We present case studies involving NAMD, a parallel classical molecular dynamics application for large biomolecular systems, and CPAIMD, a Car-Parrinello ab initio molecular dynamics application, and describe our efforts to scale them to large numbers of processors. Both applications are implemented in Charm++, and the performance analysis was carried out using Projections, the performance visualization and analysis tool associated with Charm++. We showcase a series of optimizations facilitated by Projections. The resulting performance of NAMD earned a Gordon Bell award at SC 2002, with unprecedented speedup on 3,000 processors and teraflops-level peak performance. We also explore techniques for applying performance visualization and analysis tools to future-generation extreme-scale parallel machines, and discuss the scalability issues facing Projections itself.
