Quasi-Newton Methods: A New Direction

Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many quasi-Newton methods, including the most popular ones, can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new perspective elucidates some shortcomings of classical algorithms and points the way to a novel nonparametric quasi-Newton method, which makes more efficient use of available information at a computational cost similar to that of its predecessors.
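
To make the "local quadratic approximation" reading concrete, the following is a minimal sketch of the classical BFGS update of a Hessian estimate, written in plain Python. It is not the nonparametric method proposed in the paper, and it does not reproduce the Bayesian derivation; it only illustrates how one observed step s and gradient difference y are absorbed into the quadratic model so that the updated estimate satisfies the secant equation B_new s = y. The test matrix A, the step s, and the helper names matvec, dot, and bfgs_update are assumptions chosen for illustration.

```python
def matvec(M, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def bfgs_update(B, s, y):
    """Return the BFGS update of a symmetric Hessian estimate B.

    s is the step x_new - x_old, y is the gradient difference
    grad(x_new) - grad(x_old). The result satisfies B_new s = y.
    """
    Bs = matvec(B, s)          # equals (s^T B)^T because B is assumed symmetric
    sBs = dot(s, Bs)
    ys = dot(y, s)
    if ys <= 0 or sBs <= 0:
        raise ValueError("curvature condition y^T s > 0 is violated")
    n = len(s)
    return [
        [B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys for j in range(n)]
        for i in range(n)
    ]


# Hypothetical quadratic objective f(x) = 0.5 * x^T A x, whose true Hessian is A.
A = [[3.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]   # initial Hessian estimate: identity
s = [1.0, -0.5]                # an arbitrary step
y = matvec(A, s)               # for a quadratic, the gradient difference is A s

B_new = bfgs_update(B, s, y)
residual = [a - b for a, b in zip(matvec(B_new, s), y)]
print("secant residual:", residual)
```

After a single update the estimate agrees with the true Hessian only along the observed direction s; curvature in other directions is still determined by the initial estimate, which is the part of the procedure that the abstract reinterprets as a prior assumption.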
