Automatic Verification of Self-consistent MPI Performance Guidelines

The Message Passing Interface (MPI) is the most commonly used application programming interface for process communication on current large-scale parallel systems. Due to the scale and complexity of modern parallel architectures, it is becoming increasingly difficult to optimize MPI libraries, as many factors influence communication performance. To assist MPI developers and users, we propose an automatic way to check whether MPI libraries respect self-consistent performance guidelines for collective communication operations. We introduce the PGMPI framework, which detects violations of these performance guidelines through benchmarking. Our experimental results show that PGMPI can pinpoint undesired and often unexpected performance degradations of collective MPI operations. We demonstrate how to overcome performance issues in several libraries by adapting the algorithmic implementations of their respective collective MPI calls.
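
To illustrate the kind of check PGMPI automates, consider the self-consistent guideline that MPI_Allreduce should not be slower than the mock-up consisting of MPI_Reduce followed by MPI_Bcast. The program below is a minimal sketch of such a guideline check, not the PGMPI implementation itself; the message size, repetition count, and reduction of local times to a per-rank maximum are illustrative choices only.

/* Minimal sketch (not the PGMPI code) of checking one self-consistent
 * guideline: MPI_Allreduce(n) should not be slower than the mock-up
 * MPI_Reduce(n) followed by MPI_Bcast(n).  Compile with mpicc. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N    (1 << 16)   /* message size in doubles (illustrative) */
#define NREP 50          /* repetitions per measurement (illustrative) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *in  = malloc(N * sizeof(double));
    double *out = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) in[i] = (double)i;

    /* Time MPI_Allreduce. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int r = 0; r < NREP; r++)
        MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t_allreduce = (MPI_Wtime() - t0) / NREP;

    /* Time the mock-up: MPI_Reduce to root, then MPI_Bcast from root. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int r = 0; r < NREP; r++) {
        MPI_Reduce(in, out, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Bcast(out, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    }
    double t_mockup = (MPI_Wtime() - t0) / NREP;

    /* Compare the maximum time over all ranks for each variant. */
    double g_allreduce, g_mockup;
    MPI_Reduce(&t_allreduce, &g_allreduce, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&t_mockup,    &g_mockup,    1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("MPI_Allreduce: %g s, Reduce+Bcast mock-up: %g s\n",
               g_allreduce, g_mockup);
        if (g_allreduce > g_mockup)
            printf("possible guideline violation: Allreduce slower than its mock-up\n");
    }

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}

PGMPI runs such comparisons systematically across collectives, message sizes, and process counts, and applies statistical testing to the measurements before flagging a violation; the naive single comparison above is only meant to convey the idea.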
