Hybrid multicore/vectorisation technique applied to the elastic wave equation on a staggered grid

In modern physics it has become common to solve a problem by numerically solving a set of PDEs. Whether they are discretised on a finite difference grid or by a finite element approach, the main calculations are often applied to a stencil structure. Over the last decade, so-called big data problems have become commonplace; their calculations are very heavy, and accelerators and modern architectures are widely used to handle them. Although CPU and GPU clusters are often used to solve such problems, parallelisation of any calculation ideally starts with optimisation on a single processor. Unfortunately, a stencil-structured loop cannot be vectorised with high-level instructions as it stands. In this paper we propose a new way of rearranging the data structure that makes it possible to apply high-level vectorisation instructions to a stencil loop and yields significant acceleration. The method allows further acceleration when shared-memory APIs are used. We demonstrate its effectiveness by applying it to an elastic wave propagation problem on a finite difference grid. We chose Intel architecture and OpenMP (Open Multi-Processing) for the test problem because both are used extensively in many applications.
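As a rough illustration of what combining shared-memory threading with vectorisation looks like in OpenMP, the sketch below updates a velocity-stress pair on a 1D staggered grid. It does not reproduce the data rearrangement proposed here, which the abstract does not specify. The function names, the 1D setting and the single lumped coefficient `c` are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical 1D velocity-stress update on a staggered grid.
 * v holds n values at integer nodes; s holds n + 1 values at half nodes,
 * so s[i] and s[i + 1] are the two stress neighbours of v[i].
 * c lumps the time step, grid spacing and material constants together. */
void update_velocity(double *restrict v, const double *restrict s,
                     int n, double c)
{
    /* Threads split the index range; each thread's chunk is a SIMD loop.
     * Without OpenMP support the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for simd
    for (int i = 0; i < n; ++i) {
        v[i] += c * (s[i + 1] - s[i]);
    }
}

void update_stress(double *restrict s, const double *restrict v,
                   int n, double c)
{
    /* Interior half nodes only: s[0] and s[n] are left untouched and
     * act as fixed boundary values in this sketch. */
    #pragma omp parallel for simd
    for (int i = 1; i < n; ++i) {
        s[i] += c * (v[i] - v[i - 1]);
    }
}
```

One time step would call `update_velocity` followed by `update_stress`. With GCC or Clang the directives take effect only when the code is built with `-fopenmp`; whether the compiler actually emits vector instructions for these loops depends on the target and optimisation flags.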
