BookLeaf: An Unstructured Hydrodynamics Mini-Application

As the arrival of exascale computing drives a diversification away from traditional CPU-based homogeneous clusters, it is becoming increasingly difficult to ensure that computationally complex codes can run on these emerging architectures. This is especially important for large physics simulations, which are themselves becoming increasingly complex and computationally expensive. One proposed solution is to develop mini-applications that are simple enough to be ported to new frameworks with relative ease, yet remain representative of the algorithmic and performance characteristics of the original applications. In this paper we present BookLeaf, an unstructured Arbitrary Lagrangian-Eulerian (ALE) mini-application that joins the suite of representative applications developed and maintained by the UK Mini-App Consortium (UK-MAC). We first outline the reference implementation of the application in Fortran. We then describe a number of alternative implementations using a variety of parallel programming models and discuss the issues that arise when porting such an application to new architectures. Finally, we present a performance study of BookLeaf across a number of platforms using these alternative designs, together with a scaling study documenting the behaviour of the application at scale.
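To illustrate the kind of kernel that must be re-expressed under each programming model, the sketch below shows a simple element-corner gather over an unstructured connectivity array, written in Fortran with OpenMP. The routine and array names (gather_corners, elnd, xnd, and so on) are illustrative assumptions and are not taken from BookLeaf's source; the sketch shows only the indirection-heavy access pattern characteristic of unstructured hydrodynamics codes, not BookLeaf's actual implementation.

  ! Minimal sketch (hypothetical names, not BookLeaf's actual routines):
  ! gather nodal coordinates to element corners over an unstructured mesh,
  ! the kind of loop that must be rewritten for each programming model
  ! (OpenMP, CUDA, C++ portability layers, ...).
  subroutine gather_corners(nel, nnd, elnd, xnd, ynd, xel, yel)
    implicit none
    integer, intent(in)  :: nel, nnd
    integer, intent(in)  :: elnd(4, nel)       ! element-to-node connectivity (quads)
    real(8), intent(in)  :: xnd(nnd), ynd(nnd) ! nodal coordinates
    real(8), intent(out) :: xel(4, nel), yel(4, nel)
    integer :: iel, j

    !$omp parallel do private(j)
    do iel = 1, nel
      do j = 1, 4
        ! indirect access through the connectivity array
        xel(j, iel) = xnd(elnd(j, iel))
        yel(j, iel) = ynd(elnd(j, iel))
      end do
    end do
    !$omp end parallel do
  end subroutine gather_corners

Loops of this shape, where data are reached through an element-to-node connectivity array rather than by regular strides, are what distinguish an unstructured mini-application from its structured-mesh counterparts, and they account for much of the effort involved in porting to new architectures.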
