Probabilistic Interpretation of Linear Solvers

This manuscript proposes a probabilistic framework for algorithms that iteratively solve unconstrained linear problems $Bx = b$ for $x$, where $B$ is positive definite. The goal is to replace the point estimates returned by existing methods with a Gaussian posterior belief over the elements of the inverse of $B$, which can be used to estimate errors. The framework extends recent probabilistic interpretations of the secant family of quasi-Newton optimization algorithms. Combined with properties of the conjugate gradient algorithm, this leads to uncertainty-calibrated methods with very limited cost overhead over conjugate gradients, a self-contained novel interpretation of the quasi-Newton and conjugate gradient algorithms, and a foundation for new nonlinear optimization methods.
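For orientation, the kind of iterative solver the abstract refers to can be sketched as follows. This is a minimal sketch of the classical conjugate gradient method for $Bx = b$ with positive definite $B$, not the probabilistic method proposed in the manuscript. The function names (`conjugate_gradient`, `matvec`, `dot`) and the accumulated matrix `H` are illustrative assumptions. `H` uses the standard identity that the sum of $d_i d_i^\top / (d_i^\top B d_i)$ over a full set of $B$-conjugate directions equals $B^{-1}$, so it is a point estimate of the inverse rather than the Gaussian posterior belief described above.

```python
def matvec(B, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(b_ij * v_j for b_ij, v_j in zip(row, v)) for row in B]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(B, b, tol=1e-12, max_iter=None):
    """Solve B x = b for symmetric positive definite B.

    Returns the solution estimate x and a matrix H built from the
    search directions (hypothetical helper output, see lead-in).
    """
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = [0.0] * n
    r = list(b)          # residual b - B x at the starting point x = 0
    d = list(r)          # first search direction is the residual
    rr = dot(r, r)
    H = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Bd = matvec(B, d)
        dBd = dot(d, Bd)
        alpha = rr / dBd                     # exact step length along d
        x = [x_i + alpha * d_i for x_i, d_i in zip(x, d)]
        r = [r_i - alpha * Bd_i for r_i, Bd_i in zip(r, Bd)]
        # Accumulate the rank-one term d d^T / (d^T B d).
        for i in range(n):
            for j in range(n):
                H[i][j] += d[i] * d[j] / dBd
        rr_new = dot(r, r)
        # New direction: residual plus a multiple of the old direction,
        # chosen so that the directions stay B-conjugate.
        d = [r_i + (rr_new / rr) * d_i for r_i, d_i in zip(r, d)]
        rr = rr_new
    return x, H


if __name__ == "__main__":
    B = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x, H = conjugate_gradient(B, b)
    print("x =", x)
    print("H =", H)
```

For the 2-by-2 example, two iterations suffice in exact arithmetic, and `H` then agrees with $B^{-1}$ up to rounding. If the loop stops earlier, `H` is only a low-rank approximation of the inverse.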

Philipp Hennig

[1] Michael L. Overton,et al. Primal-Dual Interior-Point Methods for Semidefinite Programming: Convergence Rates, Stability and Numerical Results , 1998, SIAM J. Optim..

[2] Samuel D. Conte,et al. Elementary Numerical Analysis: An Algorithmic Approach , 1975 .

[3] Martin Kiefel,et al. Quasi-Newton Methods: A New Direction , 2012, ICML.

[4] H. Walker. Quasi-Newton Methods , 1978 .

[5] Iain Murray. Introduction To Gaussian Processes , 2008 .

[6] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[7] L. Lecam. Convergence of Estimates Under Dimensionality Restrictions , 1973 .

[8] Larry Nazareth,et al. A family of variable metric updates , 1977, Math. Program..

[9] C. G. Broyden. A Class of Methods for Solving Nonlinear Simultaneous Equations , 1965 .

[10] C. M. Reeves,et al. Function minimization by conjugate gradients , 1964, Comput. J..

[11] Roger Fletcher,et al. A Rapidly Convergent Descent Method for Minimization , 1963, Comput. J..

[12] S. Vajda,et al. Numerical Methods for Non-Linear Optimization , 1973 .

[13] J. J. Moré,et al. Quasi-Newton Methods, Motivation and Theory , 1974 .

[14] M. Powell. A New Algorithm for Unconstrained Optimization , 1970 .

[15] D. Gay,et al. Some Convergence Properties of Broyden's Method , 1977 .

[16] William C. Davidon,et al. Optimally conditioned optimization algorithms without line searches , 1975, Math. Program..

[17] S. Gupta,et al. Statistical decision theory and related topics IV , 1988 .

[18] Hector J. Martinez. Local and Superlinear Convergence of Structural Secant Methods from the Convex Class , 1988 .

[19] Fuzhen Zhang. The Schur complement and its applications , 2005 .

[20] P. Diaconis,et al. The Subgroup Algorithm for Generating Uniform Random Variables , 1987, Probability in the Engineering and Informational Sciences.

[21] L. C. W. Dixon,et al. Quasi-Newton algorithms generate identical points , 1972, Math. Program..

[22] H. Walker,et al. Convergence Theorems for Least-Change Secant Update Methods , 1981 .

[23] Klaus Ritter,et al. Bayesian numerical analysis , 2000 .

[24] R. Schnabel,et al. Least Change Secant Updates for Quasi-Newton Methods , 1978 .

[25] Shirley Dex,et al. On the Operation and Management of the JR Integrated Passenger Ticket Sales System (MARS) , 1991 .

[26] R. Muirhead. Aspects of Multivariate Statistical Theory , 1982, Wiley Series in Probability and Statistics.

[27] D. J. Bell,et al. Numerical Methods for Unconstrained Optimization , 1979 .

[28] L. Nazareth. A Relationship between the BFGS and Conjugate Gradient Algorithms and Its Implications for New Algorithms , 1979 .

[29] D. Shanno. Conditioning of Quasi-Newton Methods for Function Minimization , 1970 .

[30] J. Nocedal. Updating Quasi-Newton Matrices With Limited Storage , 1980 .

[31] L. C. W. Dixon,et al. Quasi Newton techniques generate identical points II: The proofs of four new theorems , 1972, Math. Program..

[32] M. Hestenes,et al. Methods of conjugate gradients for solving linear systems , 1952 .

[33] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.

[34] William C. Davidon,et al. Variable Metric Method for Minimization , 1959, SIAM J. Optim..

[35] R. Fletcher,et al. A New Approach to Variable Metric Algorithms , 1970, Comput. J..

[36] Philipp Hennig,et al. Fast Probabilistic Optimization from Noisy Gradients , 2013, ICML.

[37] Charles R. Johnson,et al. Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.

[38] C. G. Broyden. Quasi-Newton methods and their application to function minimisation , 1967 .

[39] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..

[40] C. Loan. The ubiquitous Kronecker product , 2000 .

[41] J. Greenstadt. Variations on Variable-Metric Methods , 1970 .