Paper Information - Task-Based Programming on Emerging Parallel Architectures for Finite-Differences Seismic Numerical Kernel

Task-Based Programming on Emerging Parallel Architectures for Finite-Differences Seismic Numerical Kernel

In recent years, heterogeneous hardware has become common in almost all supercomputer nodes, requiring a profound shift in the way numerical applications are implemented. This paper illustrates the design and implementation of a seismic wave propagation simulator based on the finite-difference numerical scheme and specifically tailored for such massively parallel hardware infrastructures. The application data-flow is built on top of PaRSEC, a generic task-based runtime system. The numerical kernels, designed to maximize data reuse, can efficiently leverage the large SIMD units available in modern CPU cores. A strong scalability study on a cluster of Intel KNL processors illustrates the application's performance.
