Enhancing speed and scalability of the ParFlow simulation code

Regional hydrology studies are often supported by high-resolution simulations of subsurface flow, which are computationally expensive, so efficient use of the latest high-performance parallel computing systems becomes a necessity. The simulation software ParFlow has been demonstrated to meet this requirement, exhibiting excellent solver scalability on up to 16,384 processes. In the present work, we show that the code requires further enhancements in order to fully exploit current petascale machines. We identify ParFlow's parallelization of the computational mesh as a central bottleneck and propose to reorganize this subsystem using the fast mesh partitioning algorithms provided by the parallel adaptive mesh refinement library p4est. We realize this in a minimally invasive manner by modifying selected parts of the code to reinterpret the existing mesh data structures. We evaluate the scaling performance of the modified version of ParFlow, demonstrating good weak and strong scaling on up to 458k cores of the Juqueen supercomputer, and test an example application at large scale.
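To make the proposed coupling concrete, the following is a minimal C sketch, not ParFlow's actual implementation: it treats a hypothetical 3D array of ParFlow subgrids as the leaves of a uniform p4est forest, using the library's brick connectivity and its space-filling-curve partition to assign subgrids to MPI ranks. The subgrid counts NUM_SUBGRIDS_* are illustrative placeholders.

/*
 * Minimal sketch (assumed names, not ParFlow's code): treat a 3D array
 * of ParFlow subgrids as the leaves of a uniform p4est forest and let
 * the library's space-filling-curve partition assign subgrids to MPI
 * ranks.  Compile against p4est, e.g.:
 *   mpicc sketch.c -lp4est -lsc -o sketch
 */
#include <p8est.h>

/* Hypothetical subgrid counts per coordinate axis. */
#define NUM_SUBGRIDS_X 8
#define NUM_SUBGRIDS_Y 8
#define NUM_SUBGRIDS_Z 4

int
main (int argc, char **argv)
{
  sc_MPI_Comm           mpicomm = sc_MPI_COMM_WORLD;
  p8est_connectivity_t *conn;
  p8est_t              *forest;

  sc_MPI_Init (&argc, &argv);
  sc_init (mpicomm, 1, 1, NULL, SC_LP_ESSENTIAL);
  p4est_init (NULL, SC_LP_ESSENTIAL);

  /* One octree per subgrid, connected as a non-periodic 3D brick. */
  conn = p8est_connectivity_new_brick (NUM_SUBGRIDS_X, NUM_SUBGRIDS_Y,
                                       NUM_SUBGRIDS_Z, 0, 0, 0);

  /* A uniform forest of level-0 octants: exactly one leaf per tree,
   * i.e. per subgrid; p8est_new already distributes these leaves
   * across ranks along the Morton curve. */
  forest = p8est_new (mpicomm, conn, 0, NULL, NULL);

  /* Rebalance explicitly; with uniform weights (NULL weight function)
   * this yields an even split of subgrids along the curve. */
  p8est_partition (forest, 0, NULL);

  p8est_destroy (forest);
  p8est_connectivity_destroy (conn);
  sc_finalize ();
  sc_MPI_Finalize ();
  return 0;
}

With one level-0 octant per subgrid and uniform weights, p8est_partition evenly splits the Morton-ordered list of subgrids across ranks; this sketches the kind of fast, scalable load balancing that the abstract attributes to p4est, without reproducing the actual reinterpretation of ParFlow's mesh data structures.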
