A BLAS-3 Version of the QR Factorization with Column Pivoting

The QR factorization with column pivoting (QRP), originally suggested by Golub [Numer. Math., 7 (1965), 206--216], is a popular approach to computing rank-revealing factorizations. It was implemented in LINPACK using Level 1 BLAS and in LAPACK using Level 2 BLAS. While the Level 2 BLAS version generally delivers superior performance, it may perform worse for large matrices because of cache effects. We introduce a modification of the QRP algorithm that allows the use of Level 3 BLAS kernels while maintaining the numerical behavior of the LINPACK and LAPACK implementations. Experimental comparisons of this approach with the LINPACK and LAPACK implementations on IBM RS/6000, SGI R8000, and DEC AXP platforms show considerable performance improvements.
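
For orientation, the classical unblocked QRP scheme that the abstract takes as its starting point can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the Level 3 BLAS algorithm proposed in the paper and not the LINPACK or LAPACK code: the function name `qr_column_pivoting` is hypothetical, the matrix is a dense list of lists, and the trailing column norms are recomputed at every step for simplicity, whereas library implementations typically update them incrementally.

```python
import math

def qr_column_pivoting(a):
    """Unblocked Householder QR with column pivoting (illustrative sketch).

    Returns (r, perm): r is the m x n upper-triangular factor and perm[j]
    is the original index of the column that ended up in position j.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]                 # work on a copy of the input
    perm = list(range(n))
    # Squared norms of the not-yet-eliminated part of each column.
    norms = [sum(r[i][j] ** 2 for i in range(m)) for j in range(n)]

    for k in range(min(m, n)):
        # Pivot: move the remaining column of largest norm to position k.
        p = max(range(k, n), key=lambda j: norms[j])
        if p != k:
            for row in r:
                row[k], row[p] = row[p], row[k]
            perm[k], perm[p] = perm[p], perm[k]
            norms[k], norms[p] = norms[p], norms[k]

        # Householder vector v that zeroes r[k+1:, k].
        x = [r[i][k] for i in range(k, m)]
        alpha = math.sqrt(sum(t * t for t in x))
        if alpha == 0.0:
            break                             # all remaining columns are zero
        if x[0] > 0:
            alpha = -alpha                    # sign choice avoids cancellation
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)

        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1, one at a time.
        for j in range(k, n):
            s = 2.0 * sum(v[i] * r[k + i][j] for i in range(m - k)) / vtv
            for i in range(m - k):
                r[k + i][j] -= s * v[i]
        for i in range(k + 1, m):
            r[i][k] = 0.0                     # remove rounding residue

        # Assumption of this sketch: recompute trailing norms from scratch.
        for j in range(k + 1, n):
            norms[j] = sum(r[i][j] ** 2 for i in range(k + 1, m))
    return r, perm
```

Calling `r, perm = qr_column_pivoting(a)` on a full-rank matrix yields a triangular factor whose diagonal entries are non-increasing in magnitude, which is the property that makes the factorization useful for estimating rank. Each step here applies one reflector to the whole trailing matrix before the next pivot is chosen, so the work consists of matrix-vector style operations rather than the matrix-matrix operations that Level 3 BLAS kernels provide.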
