Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix

We analyze the convergence of randomized trace estimators. Since 1989, several algorithms have been proposed for estimating the trace of a matrix by (1/<i>M</i>) ∑<sub><i>i</i>=1</sub><sup><i>M</i></sup> <i>z</i><sub><i>i</i></sub><sup><i>T</i></sup><i>Az</i><sub><i>i</i></sub>, where the <i>z</i><sub><i>i</i></sub> are random vectors. Different estimators use different distributions for the <i>z</i><sub><i>i</i></sub>, all of which lead to <i>E</i>((1/<i>M</i>) ∑<sub><i>i</i>=1</sub><sup><i>M</i></sup> <i>z</i><sub><i>i</i></sub><sup><i>T</i></sup><i>Az</i><sub><i>i</i></sub>) = trace(<i>A</i>). These algorithms are useful in applications in which there is no explicit representation of <i>A</i> but rather an efficient method to compute <i>z</i><sup><i>T</i></sup><i>Az</i> given <i>z</i>. Existing results only analyze the variance of the different estimators. In contrast, we analyze the number of samples <i>M</i> required to guarantee that, with probability at least 1−δ, the relative error in the estimate is at most ε. We argue that such bounds are much more useful in applications than the variance. We find that these bounds rank the estimators differently than the variance does, which suggests that minimum-variance estimators may not be the best. We also make two additional contributions to this area. The first is a specialized bound for projection matrices, whose trace (rank) needs to be computed in electronic structure calculations. The second is a new estimator that uses less randomness than all the existing estimators.
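The averaging scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `estimate_trace`, the `quad_form` callback, and the choice of NumPy are assumptions made for illustration, and only two common sampling distributions (random ±1 entries and standard Gaussian entries) are shown. The matrix is accessed only through a routine that returns <i>z</i><sup><i>T</i></sup><i>Az</i> for a given <i>z</i>, matching the implicit setting of the abstract.

```python
import numpy as np


def estimate_trace(quad_form, n, num_samples, dist="rademacher", seed=0):
    """Estimate trace(A) as the average of z^T A z over random vectors z.

    quad_form   -- hypothetical callback returning z^T A z for a vector z
    n           -- dimension of A
    num_samples -- number of random vectors M
    dist        -- "rademacher" (entries +1/-1) or "gaussian" (entries N(0, 1))
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        if dist == "rademacher":
            z = rng.choice([-1.0, 1.0], size=n)
        elif dist == "gaussian":
            z = rng.standard_normal(n)
        else:
            raise ValueError(f"unknown distribution: {dist}")
        # A is touched only through the quadratic form.
        total += quad_form(z)
    return total / num_samples


if __name__ == "__main__":
    # Build a small symmetric positive semi-definite test matrix.
    rng = np.random.default_rng(1)
    B = rng.standard_normal((50, 50))
    A = B @ B.T
    approx = estimate_trace(lambda z: z @ (A @ z), n=50, num_samples=2000)
    print("estimate:", approx, "exact:", np.trace(A))
```

Both distributions have independent entries with zero mean and unit variance, which is what makes the expectation of each sample equal to trace(<i>A</i>). How many samples <i>M</i> are needed for a given ε and δ is the question the paper addresses, and this sketch does not attempt to reproduce those bounds.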
