A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization

We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and the linear conjugate gradient algorithm, with explicit detection and use of negative-curvature directions of the Hessian of the objective function. The algorithm closely tracks Newton-conjugate-gradient procedures developed in the 1980s, but includes enhancements that allow worst-case complexity results to be proved for convergence to points that satisfy approximate first-order and second-order optimality conditions. The complexity results match the best known bounds in the literature for second-order methods.
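To illustrate the kind of procedure the abstract describes, the following is a minimal Python sketch (not the authors' exact algorithm) of a single Newton-CG step: conjugate gradient is applied to the Newton system, curvature is monitored at every CG iteration, and a negative-curvature direction is returned as soon as one is encountered. The names `grad`, `hess_vec`, `eps_H`, `cg_tol`, and `max_cg_iters` are illustrative assumptions, and the paper's additional safeguards (such as iteration caps tied to the accuracy thresholds) are omitted.

```python
# Minimal sketch of a Newton-CG step with negative-curvature detection.
# Assumptions (not from the paper's text): `grad` is the gradient vector at the
# current point, `hess_vec` returns Hessian-vector products, and the tolerances
# are illustrative placeholders.
import numpy as np

def newton_cg_step(grad, hess_vec, eps_H=1e-4, cg_tol=1e-6, max_cg_iters=100):
    """Return a direction: an (approximate) Newton step from CG, or a
    negative-curvature direction if one is detected during CG."""
    g = np.asarray(grad, dtype=float)
    n = g.shape[0]
    if np.linalg.norm(g) < 1e-16:
        return np.zeros(n), "stationary"   # already (near-)first-order stationary

    x = np.zeros(n)          # CG iterate for the Newton system H d = -g
    r = g.copy()             # residual of H x + g
    p = -r                   # CG search direction
    for _ in range(max_cg_iters):
        Hp = hess_vec(p)
        curv = p @ Hp
        if curv <= eps_H * (p @ p):
            # Insufficient (or negative) curvature detected: return the
            # curvature direction, signed so that it is a descent direction.
            d = p if g @ p <= 0 else -p
            return d, "negative_curvature"
        alpha = (r @ r) / curv
        x += alpha * p
        r_new = r + alpha * Hp
        if np.linalg.norm(r_new) <= cg_tol * np.linalg.norm(g):
            return x, "newton"             # inexact Newton step accepted
        beta = (r_new @ r_new) / (r @ r)
        p = -r_new + beta * p
        r = r_new
    return x, "newton"   # return the best CG iterate if the iteration cap is hit
```

In a complete method, the returned direction and the accompanying flag would feed a step-length procedure, and the worst-case complexity guarantees rely on safeguards that this sketch does not reproduce.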
