Solving Acoustic Boundary Integral Equations Using High Performance Tile Low-Rank LU Factorization

We design and develop a new high performance implementation of a fast direct LU-based solver using low-rank approximations on massively parallel systems. The LU factorization is the most time-consuming step in solving the systems of linear equations that arise when analyzing acoustic scattering from large 3D objects. The matrix equation is obtained by discretizing the boundary integral equation of the exterior Helmholtz problem using a higher-order Nyström scheme. The main idea is to exploit the inherent data sparsity of the matrix operator by performing local tile-centric approximations while still capturing the most significant information. In particular, the proposed LU-based solver leverages the Tile Low-Rank (TLR) data compression format, as implemented in the Hierarchical Computations on Manycore Architectures (HiCMA) library, to decrease the complexity of “classical” dense direct solvers from cubic to quadratic order. We taskify the underlying boundary integral kernels to expose fine-grained computations. We then employ the dynamic runtime system StarPU to orchestrate the scheduling of computational tasks on shared and distributed-memory systems. The resulting asynchronous execution compensates for the load imbalance caused by the heterogeneous tile ranks while mitigating the overhead of data motion. We assess the robustness of our TLR LU-based solver and study the qualitative impact of using different numerical accuracies. The new TLR LU factorization outperforms state-of-the-art dense factorizations by up to an order of magnitude on various parallel systems for the analysis of scattering from large-scale 3D synthetic and real geometries.
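To make the tile-centric approximation concrete, the following minimal sketch compresses one off-diagonal tile of a Helmholtz-type kernel matrix with a truncated SVD. It is an illustration under stated assumptions, not the HiCMA implementation: the function names (helmholtz_tile, compress_tile), the random point clouds, the wavenumber, and the threshold are all hypothetical, and the quadrature weights of a Nyström scheme are omitted.

```python
import numpy as np

def helmholtz_tile(targets, sources, k):
    """Dense tile of the free-space Helmholtz kernel exp(i*k*r) / (4*pi*r).

    Illustrative only: a real Nyström discretization would also apply
    quadrature weights and treat singular (near-diagonal) entries.
    """
    r = np.linalg.norm(targets[:, None, :] - sources[None, :, :], axis=2)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

def compress_tile(tile, tol):
    """Return factors U, V with tile ~= U @ V.

    Keeps the singular values larger than tol times the largest one.
    (Assumed compression rule; other truncation criteria are possible.)
    """
    u, s, vh = np.linalg.svd(tile, full_matrices=False)
    rank = max(1, int(np.sum(s > tol * s[0])))
    return u[:, :rank] * s[:rank], vh[:rank, :]

# Two well-separated point clusters stand in for two distant surface patches.
rng = np.random.default_rng(0)
nb = 128                                    # hypothetical tile size
sources = rng.random((nb, 3))
targets = rng.random((nb, 3)) + np.array([10.0, 0.0, 0.0])

tile = helmholtz_tile(targets, sources, k=2.0)
U, V = compress_tile(tile, tol=1e-6)

rank = U.shape[1]
rel_err = np.linalg.norm(tile - U @ V) / np.linalg.norm(tile)
print(f"tile size {nb}, numerical rank {rank}, relative error {rel_err:.2e}")
```

Storing U and V takes 2 * nb * rank entries instead of nb * nb, so a tile is worth compressing when its rank is below nb / 2. Ranks generally differ from tile to tile, which is the source of the load imbalance that the asynchronous task scheduling is meant to absorb.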
