Improving multifrontal solvers by means of algebraic Block Low-Rank representations

We consider the solution of large sparse linear systems by direct factorization based on a multifrontal approach. Direct factorization methods are numerically robust and easy to use: they need only algebraic information (the input matrix A and a right-hand side b), although they can also exploit preprocessing strategies based on geometric information. They are, however, computationally intensive in both memory and operations, which limits their scope on very large problems (matrices with up to a few hundred million equations). This work focuses on exploiting low-rank approximations in multifrontal-based direct methods to reduce both the memory footprint and the operation count, in sequential and distributed-memory environments, on a wide class of problems. We first survey the low-rank formats that have previously been developed to represent dense matrices efficiently and that have been widely used to design fast solvers for partial differential equations, integral equations and eigenvalue problems. These formats are hierarchical (H and Hierarchically Semiseparable matrices are the most common ones) and have been shown, both theoretically and in practice, to substantially decrease the memory and operation requirements of linear algebra computations. However, they impose many structural constraints which can limit their scope and efficiency, especially in the context of general-purpose multifrontal solvers. We propose a flat format called Block Low-Rank (BLR), based on a natural blocking of the matrices, and explain why it provides all the flexibility needed by a general-purpose multifrontal solver in terms of numerical pivoting for stability and parallelism. We compare the BLR format with other formats and show that BLR does not significantly compromise the memory and operation improvements achieved through low-rank approximations.
A stability study shows that the approximations are well controlled by an explicit numerical parameter called the low-rank threshold, which is critical for solving the sparse linear system accurately. We then give details on how Block Low-Rank factorizations can be implemented efficiently within multifrontal solvers. We propose several Block Low-Rank factorization algorithms which allow for different types of gains. The proposed algorithms have been implemented within MUMPS (MUltifrontal Massively Parallel Solver). We first report experiments on standard problems based on partial differential equations, to analyse the main features of our BLR algorithms and to show the potential and flexibility of the approach; a comparison with a Hierarchically Semiseparable code is also given. We then test Block Low-Rank formats on large (up to a hundred million unknowns) and varied problems coming from several industrial applications. We finally illustrate the use of our approach as a preconditioning method for the Conjugate Gradient method.
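To make the idea of a flat, blocked low-rank format and of a low-rank threshold more concrete, the following is a minimal sketch in Python with NumPy. It is not the MUMPS implementation: the truncated SVD is used here only as a stand-in compression kernel (the abstract does not say which kernel the thesis uses), the threshold is applied as an absolute cut-off on singular values, and all function names (`compress_block`, `blr_compress`, `blr_to_dense`, `blr_storage`, `example_matrix`) are hypothetical.

```python
import numpy as np


def compress_block(block, eps):
    """Truncated SVD of one block (assumed kernel, for illustration only).

    Singular values <= eps are dropped, so the 2-norm error of the
    returned approximation X @ Y is at most eps.
    """
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    rank = int(np.sum(s > eps))
    return u[:, :rank] * s[:rank], vt[:rank, :]


def blr_compress(a, block_size, eps):
    """Flat block low-rank representation of a dense square matrix.

    Diagonal blocks stay dense. An off-diagonal block is stored as a
    pair (X, Y) with block ~= X @ Y only when that needs fewer entries.
    """
    n = a.shape[0]
    blocks = {}
    for i in range(0, n, block_size):
        for j in range(0, n, block_size):
            blk = a[i:i + block_size, j:j + block_size]
            if i != j:
                x, y = compress_block(blk, eps)
                if x.size + y.size < blk.size:
                    blocks[(i, j)] = (x, y)
                    continue
            blocks[(i, j)] = blk.copy()
    return blocks


def blr_to_dense(blocks, n):
    """Rebuild a dense matrix from the blocked representation."""
    a = np.zeros((n, n))
    for (i, j), blk in blocks.items():
        dense = blk[0] @ blk[1] if isinstance(blk, tuple) else blk
        a[i:i + dense.shape[0], j:j + dense.shape[1]] = dense
    return a


def blr_storage(blocks):
    """Number of stored floating-point entries."""
    total = 0
    for blk in blocks.values():
        total += blk[0].size + blk[1].size if isinstance(blk, tuple) else blk.size
    return total


def example_matrix(n):
    """Smooth kernel 1 / (1 + |x_i - x_j|) on a uniform 1D grid."""
    x = np.linspace(0.0, 1.0, n)
    return 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))


if __name__ == "__main__":
    n, block_size = 256, 32
    a = example_matrix(n)
    for eps in (1e-2, 1e-8):
        blocks = blr_compress(a, block_size, eps)
        err = np.linalg.norm(a - blr_to_dense(blocks, n))
        print(f"eps={eps:g}  stored entries={blr_storage(blocks)}  error={err:.2e}")
```

In this sketch the blocking is a plain uniform partition and no hierarchy is built, which is the sense in which the format is flat. Raising the threshold lowers the ranks and the storage at the price of a larger approximation error, so the threshold trades accuracy against memory and operations.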
