A parallel shared-memory implementation of a high-order accurate solution technique for variable coefficient Helmholtz problems

The recently developed Hierarchical Poincaré-Steklov (HPS) method is a high-order discretization technique that comes with a direct solver. Results from previous papers demonstrate the method's ability to solve Helmholtz problems to high accuracy without the so-called pollution effect. While the asymptotic scaling of the direct solver's computational cost is the same as that of the nested dissection method, serial implementations of the solution technique are not practical for large-scale numerical simulations. This manuscript presents the first parallel implementation of the HPS method. Specifically, we introduce an approach for a shared-memory implementation of the solution technique utilizing parallel linear algebra. This approach is the foundation for future large-scale simulations on supercomputers and clusters with large-memory nodes. Performance results on a desktop computer (resembling a large-memory node) are presented.
