Stability of block algorithms with fast level-3 BLAS

Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar, they have a higher level of granularity than point algorithms, and this makes them well suited to high-performance computers. The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated here. It is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level-3 BLAS (BLAS3). One implication is that the block algorithms are as stable as the corresponding point algorithms when conventional BLAS3 are used. A second implication is that the use of BLAS3 based on fast matrix multiplication techniques affects the stability only insofar as it increases the constant terms in the normwise backward error bounds. For linear equation solvers employing LU factorization, it is shown that fixed precision iterative refinement helps to mitigate the effect of the larger error constants. Despite the positive results presented here, not all plausible block algorithms are stable; we illustrate this with the example of LU factorization with block triangular factors and describe how to check a block algorithm for stability without doing a full error analysis.

[1]  J. H. Wilkinson Error Analysis of Eigenvalue Techniques Based on Orthogonal Transformations , 1962 .

[2]  W. Prager,et al.  Compatibility of approximate solution of linear equations with given error bounds for coefficients and right-hand sides , 1964 .

[3]  J. H. Wilkinson The algebraic eigenvalue problem , 1966 .

[4]  Shmuel Winograd,et al.  A New Algorithm for Inner Product , 1968, IEEE Transactions on Computers.

[5]  V. Strassen Gaussian elimination is not optimal , 1969 .

[6]  R. Brent Error analysis of algorithms for matrix multiplication and triangular decomposition using Winograd's identity , 1970 .

[7]  R. Skeel Iterative refinement implies numerical stability for Gaussian elimination , 1980 .

[8]  B. Parlett The Symmetric Eigenvalue Problem , 1981 .

[9]  Gene H. Golub,et al.  Matrix computations , 1983 .

[10]  Christian H. Bischof,et al.  The WY representation for products of householder matrices , 1985, PPSC.

[11]  L. Kaufman,et al.  Squeezing the most out of eigenvalue solvers on high-performance computers , 1986 .

[12]  William Jalby,et al.  Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design , 1988 .

[13]  Robert Schreiber,et al.  Block Algorithms for Parallel Machines , 1988 .

[14]  N. Higham How Accurate is Gaussian Elimination , 1989 .

[15]  James Demmel,et al.  On a Block Implementation of Hessenberg Multishift QR Iteration , 1989, Int. J. High Speed Comput..

[16]  N. Higham A Collection of Test Matrices in MATLAB , 1989 .

[17]  P. Mayes,et al.  LAPACK working note No. 12: Banded cholesky factorization using level 3 BLAS , 1989 .

[18]  C. Loan,et al.  A Storage-Efficient $WY$ Representation for Products of Householder Transformations , 1989 .

[19]  Jack Dongarra,et al.  A Project for Developing a Linear Algebra Library for High-Performance Computers , 1989 .

[20]  J. Demmel,et al.  On Floating Point Errors in Cholesky , 1989 .

[21]  Jack J. Dongarra,et al.  A set of level 3 basic linear algebra subprograms , 1990, TOMS.

[22]  K. A. Gallivan,et al.  Parallel Algorithms for Dense Linear Algebra Computations , 1990, SIAM Rev..

[23]  N. Higham Bounding the error in Gaussian Elimination for Tridiagonal systems , 1990 .

[24]  Nicholas J. Higham,et al.  Exploiting fast matrix multiplication within the level 3 BLAS , 1990, TOMS.

[25]  D. Sorensen,et al.  Block reduction of matrices to condensed forms for eigenvalue computations , 1990 .

[26]  Jack J. Dongarra,et al.  Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs , 1990, TOMS.

[27]  N. Higham Iterative refinement enhances the stability ofQR factorization methods for solving linear equations , 1991 .

[28]  Nicholas J. Higham,et al.  Algorithm 694: a collection of test matrices in MATLAB , 1991, TOMS.

[29]  Nicholas J. Higham Stability of a Method for Multiplying Complex Matrices with Three Real Matrix Multiplications , 1992, SIAM J. Matrix Anal. Appl..

[30]  N. Higham,et al.  Stability of methods for matrix inversion , 1992 .