Algorithm 915, SuiteSparseQR: Multifrontal multithreaded rank-revealing sparse QR factorization

SuiteSparseQR is a sparse QR factorization package based on the multifrontal method. Within each frontal matrix, LAPACK and the multithreaded BLAS enable the method to obtain high performance on multicore architectures. Parallelism across different frontal matrices is handled with Intel's Threading Building Blocks library. The symbolic analysis and ordering phase pre-eliminates singletons by permuting the input matrix A into the form [R11 R12; 0 A22], where R11 is upper triangular with diagonal entries above a given tolerance. Next, the fill-reducing ordering, column elimination tree, and frontal matrix structures are found without requiring the formation of the pattern of A^T A. Approximate rank detection is performed within each frontal matrix using Heath's method. While Heath's method is not always exact, it has the advantage of not requiring column pivoting and thus does not interfere with the fill-reducing ordering. For sufficiently large problems, the resulting sparse QR factorization obtains a substantial fraction of the theoretical peak performance of a multicore computer.
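
The singleton pre-elimination step can be illustrated with a small sketch. It is not SuiteSparseQR's implementation. It assumes that a singleton is a column with exactly one entry among the rows not yet eliminated, and that a candidate whose entry does not exceed the tolerance is simply left in A22. The function name `find_column_singletons` and the dictionary-of-entries input format are hypothetical, chosen only for this illustration.

```python
from collections import deque

def find_column_singletons(m, n, entries, tol=0.0):
    """Return (row_perm, col_perm, k) so that the permuted matrix has the
    form [R11 R12; 0 A22], with R11 a k-by-k upper triangular block.

    entries: dict mapping (row, col) -> value for an m-by-n sparse matrix.
    Illustrative sketch only; names and input format are assumptions.
    """
    rows_of_col = [[] for _ in range(n)]
    cols_of_row = [[] for _ in range(m)]
    for (i, j), v in entries.items():
        if v != 0:
            rows_of_col[j].append(i)
            cols_of_row[i].append(j)

    # count[j] = number of entries of column j in rows still "live"
    count = [len(r) for r in rows_of_col]
    row_live = [True] * m
    col_done = [False] * n
    queue = deque(j for j in range(n) if count[j] == 1)
    pivot_rows, pivot_cols = [], []

    while queue:
        j = queue.popleft()
        if col_done[j] or count[j] != 1:
            continue  # stale queue entry
        i = next(r for r in rows_of_col[j] if row_live[r])
        if abs(entries[(i, j)]) <= tol:
            continue  # diagonal entry too small: leave column j in A22
        # Accept (i, j) as the next diagonal entry of R11.
        col_done[j] = True
        row_live[i] = False
        pivot_rows.append(i)
        pivot_cols.append(j)
        # Removing row i may expose new singleton columns.
        for c in cols_of_row[i]:
            if not col_done[c]:
                count[c] -= 1
                if count[c] == 1:
                    queue.append(c)

    row_perm = pivot_rows + [i for i in range(m) if row_live[i]]
    col_perm = pivot_cols + [j for j in range(n) if not col_done[j]]
    return row_perm, col_perm, len(pivot_cols)

# Small 4-by-4 example: column 3 is a singleton in row 2; removing row 2
# then makes column 0 a singleton in row 0.
example = {
    (2, 3): 2.0, (2, 0): 1.0, (2, 1): 5.0,
    (0, 0): 3.0, (0, 2): 4.0,
    (1, 1): 1.0, (1, 2): 1.0,
    (3, 1): 2.0, (3, 2): 3.0,
}
print(find_column_singletons(4, 4, example))
```

Each accepted column has no entries in any row that is still live, so once its pivot row is moved to the front, everything below the diagonal in that column is zero. That is what makes the leading block upper triangular without any numerical work.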
