Performance Assessment of Hybrid Parallelism for Large-Scale Reservoir Simulation on Multi- and Many-core Architectures

Two trends are reshaping the landscape of petroleum reservoir simulators, one architecture-driven and one application-driven: an increasing number of cores per node, and increasing computational intensity arising from higher-fidelity physics at each cell. Because implicit algebraic solvers are the dominant kernels, we present hybrid MPI and OpenMP implementations of the linear solver of GigaPOWERS, a full-scale, real-world, massively parallel simulator for black-oil and compositional models. We also evaluate the impact of explicitly overlapping communication and computation by including the halo exchange in the task-dependency graph. We analyze the performance of these modifications across multi- and many-core architectures, namely Intel Haswell, Skylake, and Knights Landing, using a variety of synthetic and real-world models. The hybrid approach reduces time to solution by up to 50% on a 16-million-cell SPE10-like model on Skylake, whereas on a smaller, 1-million-cell model on Haswell and Knights Landing the two implementations achieve very similar performance. In the real-world reservoir simulations, hybrid parallelism reduces communication volume and memory consumption and improves load balancing.
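
The overlap of halo exchange and computation mentioned above can be illustrated with a minimal sketch. This is not the GigaPOWERS implementation: a background thread stands in for a non-blocking MPI exchange, two Python lists stand in for the subdomains owned by two ranks, and a 1D three-point stencil stands in for the solver's sparse matrix-vector product. All function names here (`exchange_halo`, `interior_stencil`, `overlapped_matvec`, `reference_matvec`) are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def exchange_halo(left, right):
    """Stand-in for a halo exchange between two neighbouring subdomains.

    Hypothetical helper, not an MPI call: it returns the ghost value each
    subdomain needs from the other.
    """
    return right[0], left[-1]  # (ghost for left's right edge, ghost for right's left edge)


def interior_stencil(u):
    """Apply -u[i-1] + 2*u[i] - u[i+1] to rows that need no ghost value."""
    return [-u[i - 1] + 2 * u[i] - u[i + 1] for i in range(1, len(u) - 1)]


def overlapped_matvec(left, right):
    """Stencil product over two subdomains (each of length >= 2).

    The exchange is submitted first, interior rows are computed while it is
    in flight, and only the boundary rows wait on its result.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        halo = pool.submit(exchange_halo, left, right)  # start the "communication"
        in_left = interior_stencil(left)                # interior work overlaps with it
        in_right = interior_stencil(right)
        ghost_left, ghost_right = halo.result()         # boundary rows depend on the halo
    # Outer ends of the global domain use a zero ghost value (an assumption of this sketch).
    y_left = [2 * left[0] - left[1]] + in_left + [-left[-2] + 2 * left[-1] - ghost_left]
    y_right = [-ghost_right + 2 * right[0] - right[1]] + in_right + [-right[-2] + 2 * right[-1]]
    return y_left + y_right


def reference_matvec(u):
    """Same stencil applied to the undivided vector, for checking the result."""
    n = len(u)
    return [
        -(u[i - 1] if i > 0 else 0) + 2 * u[i] - (u[i + 1] if i < n - 1 else 0)
        for i in range(n)
    ]
```

Calling `overlapped_matvec(u[:3], u[3:])` on a six-entry vector `u` should give the same result as `reference_matvec(u)`. In an MPI and OpenMP code the same dependency structure would typically be expressed with non-blocking sends and receives plus task dependencies; that mapping is a reading of the abstract, not a detail it spells out.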
