On the construction of probabilistic Newton-type algorithms

It has recently been shown that many existing quasi-Newton algorithms can be formulated as learning algorithms that learn local models of the cost function. Importantly, this understanding allows us to safely start assembling probabilistic Newton-type algorithms, applicable in situations where we only have access to noisy observations of the cost function and its derivatives. This is where our interest lies. We contribute to the use of non-parametric, probabilistic Gaussian process models for solving these stochastic optimisation problems. Specifically, we present a new algorithm that combines these Gaussian process approximations with recent probabilistic line search routines to deliver a probabilistic quasi-Newton approach. We also show that the resulting probabilistic optimisation algorithms deliver promising results on challenging nonlinear system identification problems, where the cost function and its derivatives can only be accessed via noisy observations because no closed-form expressions are available.
