Stability of the Lanczos Method for Matrix Function Approximation

The ubiquitous Lanczos method can approximate $f(A)x$ for any symmetric $n \times n$ matrix $A$, vector $x$, and function $f$. In exact arithmetic, the method's error after $k$ iterations is bounded by the error of the best degree-$k$ polynomial uniformly approximating $f(x)$ on the range $[\lambda_{\min}(A), \lambda_{\max}(A)]$. However, despite decades of work, it has been unclear whether this powerful guarantee holds in finite precision. We resolve this problem, proving that when $\max_{x \in [\lambda_{\min}, \lambda_{\max}]}|f(x)| \le C$, Lanczos essentially matches the exact arithmetic guarantee if computations use roughly $\log(nC\|A\|)$ bits of precision. Our proof extends work of Druskin and Knizhnerman [DK91], leveraging the stability of the classic Chebyshev recurrence to bound the stability of any polynomial approximating $f(x)$.

We also study the special case $f(A) = A^{-1}$, where stronger guarantees hold. In exact arithmetic, Lanczos performs as well as the best polynomial approximating $1/x$ at each of $A$'s eigenvalues, rather than on the full eigenvalue range. In seminal work, Greenbaum gives an approach to extending this bound to finite precision: she proves that finite precision Lanczos and the related conjugate gradient (CG) method match any polynomial approximating $1/x$ in a tiny range around each eigenvalue [Gre89]. For $A^{-1}$, this bound appears stronger than ours.

However, we exhibit matrices with condition number $\kappa$ for which exact arithmetic Lanczos converges in $\operatorname{polylog}(\kappa)$ iterations, but for which Greenbaum's bound predicts $\Omega(\kappa^{1/5})$ iterations. Her bound thus cannot offer significant improvement over the $O(\kappa^{1/2})$ bound achievable via our result. Our analysis raises the question of whether convergence in fewer than $\operatorname{poly}(\kappa)$ iterations can be expected in finite precision, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
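To make the setup concrete, the following minimal NumPy sketch implements the standard Lanczos approximation $f(A)x \approx \|x\|_2 \, Q_k f(T_k) e_1$, where $Q_k$ and the tridiagonal $T_k$ come from $k$ steps of the three-term recurrence. This is an illustration of the textbook method, not code from the paper; the name lanczos_fa and the breakdown handling are our own choices, and no reorthogonalization is performed, mirroring the plain recurrence whose finite precision behavior is at issue above.

    import numpy as np

    def lanczos_fa(A, x, k, f):
        # Approximate f(A) @ x for symmetric A using k steps of plain Lanczos
        # (three-term recurrence, no reorthogonalization). f must accept a
        # vector of eigenvalues, e.g. np.exp or lambda t: 1.0 / t.
        Q = np.zeros((A.shape[0], k))   # Lanczos basis vectors
        alpha = np.zeros(k)             # diagonal of the tridiagonal T_k
        beta = np.zeros(k - 1)          # off-diagonal of T_k
        Q[:, 0] = x / np.linalg.norm(x)
        for j in range(k):
            w = A @ Q[:, j]
            alpha[j] = Q[:, j] @ w
            w = w - alpha[j] * Q[:, j]
            if j > 0:
                w = w - beta[j - 1] * Q[:, j - 1]
            if j < k - 1:
                beta[j] = np.linalg.norm(w)
                if beta[j] == 0.0:      # invariant subspace found: truncate
                    alpha, beta, Q = alpha[:j + 1], beta[:j], Q[:, :j + 1]
                    break
                Q[:, j + 1] = w / beta[j]
        # Evaluate f(T_k) e_1 via an eigendecomposition of the small T_k.
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        evals, V = np.linalg.eigh(T)
        fT_e1 = V @ (f(evals) * V[0, :])  # = V diag(f(evals)) V^T e_1
        return np.linalg.norm(x) * (Q @ fT_e1)

For example, lanczos_fa(A, x, k, np.exp) approximates $e^A x$, while f = lambda t: 1.0 / t recovers the $A^{-1}x$ special case discussed above (for positive definite $A$); in exact arithmetic, the error after $k$ iterations is governed by the best degree-$k$ polynomial approximation of $f$ on $[\lambda_{\min}(A), \lambda_{\max}(A)]$.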

[1] Sanjeev Arora et al. Fast algorithms for approximate semidefinite programming using the multiplicative weights update method, 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

[2] Harlan D. Mills et al. Rounding errors in algebraic processes, 1964, IFIP Congress.

[3] N. Higham. Functions of Matrices, 2008.

[4] Aaron Sidford et al. Principal Component Projection Without Principal Component Analysis, 2016, ICML.

[5] David P. Woodruff et al. Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness, 2017, ITCS.

[6] M. Hestenes et al. Methods of conjugate gradients for solving linear systems, 1952.

[7] Leslie Greengard et al. A fast algorithm for particle simulations, 1987.

[8] Sanjeev Arora et al. A combinatorial, primal-dual approach to semidefinite programs, 2007, STOC.

[9] Yuanzhi Li et al. Faster Principal Component Regression and Stable Matrix Chebyshev Approximation, 2017, ICML.

[10] Nico M. Temme et al. Numerical methods for special functions, 2007.

[11] Maria-Florina Balcan et al. An Improved Gap-Dependency Analysis of the Noisy Power Method, 2016, COLT.

[12] Jinwoo Shin et al. Approximating Spectral Sums of Large-Scale Matrices using Stochastic Chebyshev Approximations, 2017, SIAM J. Sci. Comput.

[13] C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, 1950.

[14] M. J. D. Powell et al. On the Maximum Errors of Polynomial Approximations Defined by Interpolation and by Least Squares Criteria, 1967, Comput. J.

[15] Mark Tygert et al. A Randomized Algorithm for Principal Component Analysis, 2008, SIAM J. Matrix Anal. Appl.

[16] Edoardo Di Napoli et al. Efficient estimation of eigenvalue counts in an interval, 2013, Numer. Linear Algebra Appl.

[17] Moritz Hardt et al. The Noisy Power Method: A Meta Algorithm with Applications, 2013, NIPS.

[18] Richard Peng et al. Faster Algorithms for Computing the Stationary Distribution, Simulating Random Walks, and More, 2016, IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[19] G. Meurant et al. The Lanczos and conjugate gradient algorithms in finite precision arithmetic, 2006, Acta Numerica.

[20] Anne Greenbaum et al. Using Nonorthogonal Lanczos Vectors in the Computation of Matrix Functions, 1998, SIAM J. Sci. Comput.

[21] L. Knizhnerman et al. Error bounds in the simple Lanczos procedure for computing functions of symmetric matrices and eigenvalues, 1992.

[22] Stanley C. Eisenstat et al. A Divide-and-Conquer Algorithm for the Symmetric Tridiagonal Eigenproblem, 1995, SIAM J. Matrix Anal. Appl.

[23] Y. Saad. Numerical Methods for Large Eigenvalue Problems, 2011.

[24] John Dunagan et al. Iteratively constructing preconditioners via the conjugate gradient method, 2007, STOC '07.

[25] M. Hutchinson. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines, 1989.

[26] H. Simon. The Lanczos algorithm with partial reorthogonalization, 1984.

[27] Daniel A. Spielman et al. A Note on Preconditioning by Low-Stretch Spanning Trees, 2009, arXiv.

[28] Y. Saad. Analysis of some Krylov subspace approximations to the matrix exponential operator, 1992.

[29] Yousef Saad et al. Fast methods for estimating the numerical rank of large matrices, 2016, ICML.

[30] Cameron Musco et al. Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition, 2015, NIPS.

[31] Christopher C. Paige. The computation of eigenvalues and eigenvectors of very large sparse matrices, 1971.

[32] Vladimir Druskin et al. Krylov subspace approximation of eigenpairs and matrix functions in exact and computer arithmetic, 1995, Numer. Linear Algebra Appl.

[33] B. Parlett et al. The Lanczos algorithm with selective orthogonalization, 1979.

[34] A. Greenbaum. Behavior of slightly perturbed Lanczos and conjugate-gradient recurrences, 1989.

[35] Nicholas J. Higham et al. Inverse Problems Newsletter, 1991.

[36] Nisheeth K. Vishnoi et al. Faster Algorithms via Approximation Theory, 2014, Found. Trends Theor. Comput. Sci.

[37] Yousef Saad. Iterative methods for sparse linear systems, 2003.

[38] B. Parlett. The Symmetric Eigenvalue Problem, 1981.

[39] O. Axelsson et al. On the rate of convergence of the preconditioned conjugate gradient method, 1986.

[40] C. W. Clenshaw. A note on the summation of Chebyshev series, 1955.

[41] C. Paige. Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem, 1980.

[42] Nisheeth K. Vishnoi et al. Approximating the exponential, the Lanczos method and an Õ(m)-time spectral algorithm for balanced separator, 2011, STOC '12.

[43] Satyen Kale. Efficient algorithms using the multiplicative weights update method, 2007.

[44] Pierre Vandergheynst et al. Compressive Spectral Clustering, 2016, ICML.

[45] J. Bunch et al. Rank-one modification of the symmetric eigenproblem, 1978.

[46] Cleve B. Moler et al. Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later, 1978, SIAM Rev.

[47] C. Paige. Error Analysis of the Lanczos Algorithm for Tridiagonalizing a Symmetric Matrix, 1976.

[48] Yuanzhi Li et al. Even Faster SVD Decomposition Yet Without Agonizing Pain, 2016, NIPS.

[49] Jinwoo Shin et al. Large-scale log-determinant computation through stochastic Chebyshev expansions, 2015, ICML.