Parallel Simulations of Dynamic Earthquake Rupture along Geometrically Complex Faults on CMP Systems

Chip multiprocessors (CMPs) are widely used for high-performance computing and are configured hierarchically, with CMP compute nodes composing a CMP system. Such a system is a natural fit for hybrid MPI/OpenMP applications. In this paper, we use OpenMP to parallelize a sequential earthquake simulation code that models spontaneous earthquake rupture along geometrically complex faults, and we evaluate it on two CMP systems: an IBM POWER5+ system and a Sun Opteron server. The experimental results indicate that the OpenMP implementation produces accurate output and scales well on both systems. Applying optimization techniques such as large pages and processor binding to the OpenMP implementation yields up to a 7.05% performance improvement on the CMP systems without any code modification. Further, we illustrate an element-based partitioning scheme for explicit finite element methods. Based on this partitioning scheme and on what we learned from the OpenMP implementation, we discuss how to use hybrid MPI/OpenMP efficiently to parallelize the sequential earthquake rupture simulation code, both to exploit multiple levels of parallelism in the code and to reduce MPI communication overhead within a CMP node by taking advantage of the shared address space, high on-chip inter-core bandwidth, and low inter-core latency. Our initial experimental results indicate that the hybrid MPI/OpenMP implementation also produces accurate output and scales well on CMP systems.
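To make the idea of element-based partitioning concrete, the following is a minimal sketch, not the scheme used in the paper. It assumes a simple contiguous block assignment of elements to partitions, and the helper names `partition_elements` and `shared_nodes` are hypothetical. Each element belongs to exactly one partition, so the only data that must be exchanged between partitions in an explicit finite element time step are the contributions to nodes that lie on a partition boundary.

```python
from collections import defaultdict


def partition_elements(num_elements, num_parts):
    """Assign contiguous blocks of elements to partitions as evenly as possible.

    Assumption: a simple block distribution, used here only for illustration.
    """
    base, extra = divmod(num_elements, num_parts)
    owner = []
    for p in range(num_parts):
        # The first `extra` partitions receive one additional element.
        owner.extend([p] * (base + (1 if p < extra else 0)))
    return owner


def shared_nodes(elements, owner):
    """Return the nodes touched by elements from more than one partition.

    These boundary nodes are where nodal contributions computed by
    different partitions would have to be combined.
    """
    node_parts = defaultdict(set)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_parts[n].add(owner[e])
    return {n: sorted(ps) for n, ps in node_parts.items() if len(ps) > 1}


# Toy mesh: a 1D chain of 6 two-node elements; element e joins nodes e and e+1.
elements = [(e, e + 1) for e in range(6)]
owner = partition_elements(len(elements), 3)
boundary = shared_nodes(elements, owner)

print(owner)     # partition index of each element
print(boundary)  # boundary node -> partitions that share it
```

In a hybrid setting, each partition would typically map to one MPI process, with the loop over that partition's elements shared among OpenMP threads; only the boundary nodes reported by `shared_nodes` would require inter-process communication.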
