Deterministic and stochastic inexact regularization algorithms for nonconvex optimization with optimal complexity

A regularization algorithm using inexact function values and inexact derivatives is proposed and its evaluation complexity is analyzed. The algorithm applies to unconstrained problems and to problems with inexpensive constraints (that is, constraints whose evaluation and enforcement have negligible cost), under the assumption that the derivative of highest degree is $\beta$-H\"{o}lder continuous. It features a very flexible adaptive mechanism for determining the inexactness allowed, at each iteration, when computing objective function values and derivatives. The complexity analysis covers arbitrary optimality order and arbitrary degree of available approximate derivatives. It extends the evaluation complexity results of Cartis, Gould and Toint (2018) to the inexact case: if a $q$th-order minimizer is sought using approximations to the first $p$ derivatives, it is proved that a suitable approximate minimizer within $\epsilon$ is computed by the proposed algorithm in at most $O(\epsilon^{-\frac{p+\beta}{p-q+\beta}})$ iterations and at most $O(|\log(\epsilon)|\epsilon^{-\frac{p+\beta}{p-q+\beta}})$ approximate evaluations. While the proposed framework remains conceptual so far for high degrees and orders, it is shown to yield simple and computationally realistic inexact methods when specialized to the unconstrained and bound-constrained first- and second-order cases. Finally, the deterministic complexity results are extended to the stochastic context, yielding adaptive sample-size rules for subsampling methods typical of machine learning.
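
As a worked illustration of the iteration bound (a substitution into the stated formula, not a result quoted from the text), take $\beta = 1$, which corresponds to a Lipschitz continuous highest derivative, and insert small values of $p$ and $q$:

```latex
\[
\begin{array}{lll}
p = 1,\ q = 1: & \dfrac{p+\beta}{p-q+\beta} = \dfrac{2}{1} = 2, & O(\epsilon^{-2}) \text{ iterations}, \\[6pt]
p = 2,\ q = 1: & \dfrac{p+\beta}{p-q+\beta} = \dfrac{3}{2}, & O(\epsilon^{-3/2}) \text{ iterations}, \\[6pt]
p = 2,\ q = 2: & \dfrac{p+\beta}{p-q+\beta} = \dfrac{3}{1} = 3, & O(\epsilon^{-3}) \text{ iterations}.
\end{array}
\]
```

In each case the bound on approximate evaluations carries the additional factor $|\log(\epsilon)|$, for example $O(|\log(\epsilon)|\,\epsilon^{-3/2})$ when $p = 2$ and $q = 1$.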

[1] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.

[2] A Stochastic Line Search Method with Convergence Rate Analysis, 2018, 1807.07994.

[3] J. Dussault. Simple unified convergence proofs for Trust Region and a new ARC variant, 2015.

[4] Stefania Bellavia, et al. A Levenberg–Marquardt method for large nonlinear least-squares problems with dynamic accuracy in functions and gradients, 2018, Numerische Mathematik.

[5] Serge Gratton, et al. Minimizing convex quadratic with variable precision Krylov methods, 2018, ArXiv.

[6] P. Toint, et al. Trust-region and other regularisations of linear least-squares problems, 2009.

[7] Cho-Jui Hsieh, et al. Stochastic Second-order Methods for Non-convex Optimization with Inexact Hessian and Gradient, 2018, ArXiv.

[8] Stephen A. Vavasis, et al. Black-Box Complexity of Local Minimization, 1993, SIAM J. Optim.

[9] Peng Xu, et al. Newton-type methods for non-convex optimization under inexact Hessian information, 2017, Math. Program.

[10] Yurii Nesterov, et al. Cubic regularization of Newton method and its global performance, 2006, Math. Program.

[11] Philippe L. Toint, et al. Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization, 2017, Proceedings of the International Congress of Mathematicians (ICM 2018).

[12] Peng Xu, et al. Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study, 2017, SDM.

[13] Peng Xu, et al. Inexact Nonconvex Newton-Type Methods, 2018, INFORMS Journal on Optimization.

[14] Joel A. Tropp, et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.

[15] Serge Gratton, et al. Recursive Trust-Region Methods for Multiscale Nonlinear Optimization, 2008, SIAM J. Optim.

[16] Nicholas I. M. Gould, et al. Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints, 2018, SIAM J. Optim.

[17] P. Toint, et al. Adaptive cubic overestimation methods for unconstrained optimization, 2007.

[18] Vyacheslav Kungurtsev, et al. A Subsampling Line-Search Method with Second-Order Results, 2018, INFORMS J. Optim.

[19] José Mario Martínez, et al. Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models, 2017, Math. Program.

[20] Katya Scheinberg, et al. Convergence Rate Analysis of a Stochastic Trust-Region Method via Supermartingales, 2016, INFORMS Journal on Optimization.

[21] Michael I. Jordan, et al. Stochastic Cubic Regularization for Fast Nonconvex Optimization, 2017, NeurIPS.

[22] Katya Scheinberg, et al. Global convergence rate analysis of unconstrained optimization methods based on probabilistic models, 2015, Mathematical Programming.

[23] Tianyi Lin, et al. On Adaptive Cubic Regularized Newton's Methods for Convex Optimization via Random Sampling, 2018.