A Bregman extension of quasi-Newton updates II: Analysis of robustness properties

In Part I of this series, we introduced an information-geometric framework for quasi-Newton methods and derived an extension of the Hessian update formulas based on the Bregman divergence. The purpose of this article is to investigate the convergence and robustness properties of the extended Hessian update formulas. Fletcher studied a variational problem from which the approximate Hessian update formula of quasi-Newton methods is derived. We point out that this variational problem is equivalent to minimizing the Kullback-Leibler divergence, a discrepancy measure between two probability distributions. We then introduce the Bregman divergence as an extension of the Kullback-Leibler divergence and derive extended quasi-Newton update formulas from the variational problem with the Bregman divergence. The proposed update formulas belong to the class of self-scaling quasi-Newton methods. We study the convergence properties of the proposed quasi-Newton methods. Moreover, we apply tools from robust statistics to analyze how robust the Hessian update formulas are against numerical rounding errors and perturbations of the tuning parameters in the line search for the step length. As the main contribution of this paper, we show that among these update formulas, only the standard BFGS formula for the Hessian approximation keeps the influence of line-search perturbations bounded. Numerical studies are conducted to verify the usefulness of the tools borrowed from robust statistics.
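To make the objects named above concrete, the following display is a minimal sketch in our own notation (the potential \varphi, the iterates x_k, and the matrices B_k are our labels, not taken from the paper): the Bregman divergence on symmetric positive-definite matrices, the Kullback-Leibler-type divergence recovered by the potential \varphi(X) = -\log\det X that underlies Fletcher's variational characterization, and the standard BFGS update whose influence under line-search perturbations the paper shows to remain bounded.

% Bregman divergence generated by a strictly convex potential \varphi
% on symmetric positive-definite matrices:
\[
  D_\varphi(X, Y) = \varphi(X) - \varphi(Y)
    - \langle \nabla\varphi(Y),\, X - Y \rangle .
\]
% Choosing \varphi(X) = -\log\det X yields the Kullback--Leibler-type
% divergence appearing in Fletcher's variational problem (n is the dimension):
\[
  D(X, Y) = \operatorname{tr}(X Y^{-1}) - \log\det(X Y^{-1}) - n .
\]
% Standard BFGS update of the Hessian approximation B_k, with
% s_k = x_{k+1} - x_k and y_k = \nabla f(x_{k+1}) - \nabla f(x_k):
\[
  B_{k+1} = B_k - \frac{B_k s_k s_k^\top B_k}{s_k^\top B_k s_k}
                + \frac{y_k y_k^\top}{s_k^\top y_k} .
\]

A self-scaling variant rescales B_k by a scalar before the update; the robustness analysis in the paper asks how updates of this family respond when s_k and y_k are perturbed by inaccuracies in the line search.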

[1] Jorge Nocedal et al. Theory of algorithms for unconstrained optimization. Acta Numerica, 1992.

[2] Takafumi Kanamori et al. A Bregman extension of quasi-Newton updates I: an information geometrical framework. Optim. Methods Softw., 2010.

[3] David G. Luenberger et al. Linear and nonlinear programming. 1984.

[4] Jorge Nocedal et al. Analysis of a self-scaling quasi-Newton method. Math. Program., 1993.

[5] Roger Fletcher et al. A New Variational Result for Quasi-Newton Formulae. SIAM J. Optim., 1991.

[6] D. Luenberger et al. Self-Scaling Variable Metric (SSVM) Algorithms. 1974.

[7] J. Dennis et al. Sizing and least-change secant methods. 1993.

[8] Takafumi Kanamori et al. Information Geometry of U-Boost and Bregman Divergence. Neural Computation, 2004.

[9] Werner A. Stahel et al. Robust Statistics: The Approach Based on Influence Functions. 1987.

[10] L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. 1967.

[11] Roger Fletcher et al. An Optimal Positive Definite Update for Sparse Hessian Matrices. SIAM J. Optim., 1995.

[12] Inderjit S. Dhillon et al. Clustering with Bregman Divergences. J. Mach. Learn. Res., 2005.

[13] M. J. D. Powell et al. How bad are the BFGS and DFP methods when the objective function is quadratic? Math. Program., 1986.

[14] Nobuo Yamashita et al. Analysis of Sparse Quasi-Newton Updates with Positive Definite Matrix Completion. Journal of the Operations Research Society of China, 2014.

[15] C. G. Broyden. Quasi-Newton methods and their application to function minimisation. 1967.

[16] D. K. Smith et al. Numerical Optimization. J. Oper. Res. Soc., 2001.

[17] Nicholas I. M. Gould et al. CUTEr and SifDec: A constrained and unconstrained testing environment, revisited. ACM TOMS, 2003.

[18] P. Toint et al. Testing a class of methods for solving minimization problems with simple bounds on the variables. 1988.

[19] P. Gill et al. Quasi-Newton Methods for Unconstrained Optimization. 1972.

[20] Inderjit S. Dhillon et al. Matrix Nearness Problems with Bregman Divergences. SIAM J. Matrix Anal. Appl., 2007.

[21] J. Nocedal et al. Global Convergence of a Class of Quasi-Newton Methods on Convex Problems. SIAM J. Numer. Anal., 1987.

[22] M. J. D. Powell. Some Global Convergence Properties of a Variable Metric Algorithm for Minimization without Exact Line Searches. In Nonlinear Programming, SIAM-AMS Proceedings, 1976.