Experiences at scale with PGAS versions of a Hydrodynamics application

In this work we directly evaluate two PGAS programming models, CAF and OpenSHMEM, as candidate technologies for improving the performance and scalability of scientific applications on future exascale HPC platforms. PGAS approaches are considered by many to be a promising research direction, with the potential to address some of the problems that currently prevent codebases from scaling to exascale levels of performance. The aim of this work is to better inform the exascale planning at large HPC centres such as AWE. Such organisations invest significant resources in maintaining and updating existing scientific codebases, many of which were not designed to run at the scales required to reach exascale levels of computational performance on future system architectures. We document our approach for implementing a recently developed Lagrangian-Eulerian explicit hydrodynamics mini-application in each of these PGAS languages. We also present our results and experiences from scaling these different approaches to high node counts on two state-of-the-art, large-scale system architectures from Cray (XC30) and SGI (ICE-X), and compare their utility against that of an equivalent existing MPI implementation.
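To illustrate the style of communication the PGAS models provide, the sketch below shows a minimal one-sided halo-style transfer using the OpenSHMEM C binding. It is illustrative only and is not taken from the mini-application: the array names, halo width, and 1D periodic decomposition are assumptions made for the example. The key contrast with two-sided MPI is that the target PE posts no matching receive; the data is written directly into its symmetric memory.

```c
/* Minimal OpenSHMEM sketch (illustrative only): each PE writes its
 * boundary strip directly into its right-hand neighbour's halo buffer
 * with a one-sided put. Names and sizes are hypothetical. */
#include <shmem.h>
#include <stdio.h>

#define HALO 4                      /* hypothetical halo width          */

static double send_strip[HALO];     /* local boundary cells             */
static double recv_halo[HALO];      /* symmetric target on every PE     */

int main(void) {
    shmem_init();
    int me    = shmem_my_pe();
    int npes  = shmem_n_pes();
    int right = (me + 1) % npes;    /* periodic 1D decomposition        */

    for (int i = 0; i < HALO; i++)
        send_strip[i] = (double)me; /* stand-in for field data          */

    /* One-sided put: no matching receive is posted by the target PE.  */
    shmem_double_put(recv_halo, send_strip, HALO, right);

    /* Ensure all puts are delivered before any PE reads its halo.     */
    shmem_barrier_all();

    printf("PE %d received halo from PE %d: %f\n",
           me, (me - 1 + npes) % npes, recv_halo[0]);

    shmem_finalize();
    return 0;
}
```

A CAF version of the same exchange would express the put as a coarray assignment (e.g. writing into a coarray on image `right`), which is the one-sided model the paper evaluates alongside OpenSHMEM.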
