Quadratic and Cubic Regularisation Methods with Inexact Function and Random Derivatives for Finite-Sum Minimisation

This paper focuses on regularisation methods that use models of up to the third order to search for up to second-order critical points of a finite-sum minimisation problem. The variant presented belongs to the framework of [1]: it employs random models whose accuracy is guaranteed with a sufficiently large prefixed probability, together with deterministic inexact function evaluations within a prescribed level of accuracy. Without assuming unbiased estimators, the expected number of iterations is $\mathcal{O}(\epsilon_1^{-2})$ or $\mathcal{O}(\epsilon_1^{-3/2})$ when searching for a first-order critical point using a second- or third-order model, respectively, and $\mathcal{O}(\max[\epsilon_1^{-3/2}, \epsilon_2^{-3}])$ when seeking second-order critical points with a third-order model, where $\epsilon_j$, $j \in \{1,2\}$, is the $j$th-order tolerance. These results match the worst-case optimal complexity of the deterministic counterpart of the method. Preliminary numerical tests for first-order optimality in the context of nonconvex binary classification in imaging, with and without Artificial Neural Networks (ANNs), are presented and discussed.
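To make the finite-sum setting concrete, the sketch below shows one iteration of an adaptive cubic-regularisation step with subsampled (hence random) derivatives for $f(x) = \frac{1}{N}\sum_{i=1}^N f_i(x)$. It is a minimal illustration of the general idea only, not the paper's algorithm: the function name `arc_step`, the sampling fraction, the acceptance threshold `eta`, and the update factor `gamma` are assumptions introduced for this example, and the inner fixed-point solve is a crude placeholder for a proper cubic-subproblem solver.

```python
# Minimal sketch (not the paper's algorithm) of one subsampled adaptive
# cubic-regularisation (ARC) iteration for f(x) = (1/N) * sum_i f_i(x).
import numpy as np

def arc_step(x, fs, grads, hesss, sigma, rng,
             sample_frac=0.1, eta=0.1, gamma=2.0):
    """One step; fs/grads/hesss are per-term callables. Returns (x, sigma)."""
    N = len(fs)
    S = rng.choice(N, size=max(1, int(sample_frac * N)), replace=False)
    g = np.mean([grads[i](x) for i in S], axis=0)   # random gradient estimate
    H = np.mean([hesss[i](x) for i in S], axis=0)   # random Hessian estimate

    # Crude fixed-point iteration on the stationarity condition
    #   (H + sigma * ||s|| * I) s = -g
    # of the cubic model m(s) = g^T s + 0.5 s^T H s + (sigma/3) ||s||^3.
    # (A placeholder; it may fail for strongly indefinite H.)
    d = len(x)
    s = np.zeros(d)
    for _ in range(50):
        s = -np.linalg.solve(H + (sigma * np.linalg.norm(s) + 1e-8) * np.eye(d), g)

    model_decrease = -(g @ s + 0.5 * s @ H @ s
                       + sigma / 3.0 * np.linalg.norm(s) ** 3)
    if model_decrease <= 0:           # bad subproblem solve: inflate sigma
        return x, sigma * gamma

    f = lambda z: np.mean([fi(z) for fi in fs])  # exact here; inexact in the paper
    rho = (f(x) - f(x + s)) / model_decrease     # actual vs. predicted decrease
    if rho >= eta:                               # successful: accept, relax sigma
        return x + s, max(sigma / gamma, 1e-8)
    return x, sigma * gamma                      # unsuccessful: reject, inflate sigma

# Toy usage on a small nonconvex finite sum with terms f_i(x) = cos(a_i^T x).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
fs = [lambda z, a=a: np.cos(a @ z) for a in A]
grads = [lambda z, a=a: -np.sin(a @ z) * a for a in A]
hesss = [lambda z, a=a: -np.cos(a @ z) * np.outer(a, a) for a in A]
x, sigma = np.ones(3), 1.0
for _ in range(30):
    x, sigma = arc_step(x, fs, grads, hesss, sigma, rng, sample_frac=0.5)
```

With a third-order model the same accept/reject mechanism would apply to a quartically regularised Taylor model; the subsampled derivative estimates play the role of the random models whose accuracy holds with the prefixed probability.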

[1] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.

[2] Marcos Raydan et al. The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem, 1997, SIAM J. Optim.

[3] Yoshua Bengio et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[4] Nicol N. Schraudolph et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.

[5] Nicholas I. M. Gould et al. Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity, 2011, Math. Program.

[6] P. Toint et al. An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity, 2012.

[7] Nicholas I. M. Gould et al. Complexity bounds for second-order optimality in unconstrained optimization, 2012, J. Complex.

[8] Joel A. Tropp et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.

[9] Marco Sciandrone et al. On the use of iterative methods in cubic regularization for unconstrained optimization, 2015, Comput. Optim. Appl.

[10] Yair Carmon et al. Gradient Descent Efficiently Finds the Cubic-Regularized Non-Convex Newton Step, 2016, arXiv.

[11] José Mario Martínez et al. Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models, 2017, Math. Program.

[12] Aurélien Lucchi et al. Sub-sampled Cubic Regularization for Non-convex Optimization, 2017, ICML.

[13] Michael Unser et al. Deep Convolutional Neural Network for Inverse Problems in Imaging, 2016, IEEE Transactions on Image Processing.

[14] Michael Unser et al. Convolutional Neural Networks for Inverse Problems in Imaging: A Review, 2017, IEEE Signal Processing Magazine.

[15] Tengyu Ma et al. Finding approximate local minima faster than gradient descent, 2016, STOC.

[16] Katya Scheinberg et al. Stochastic optimization using a trust-region method and random models, 2015, Mathematical Programming.

[17] A Stochastic Line Search Method with Convergence Rate Analysis, 2018, arXiv:1807.07994.

[18] Tianyi Lin et al. On Adaptive Cubic Regularized Newton's Methods for Convex Optimization via Random Sampling, 2018.

[19] Valeria Ruggiero et al. On the steplength selection in gradient methods for unconstrained optimization, 2018, Appl. Math. Comput.

[20] Peng Xu et al. Inexact Nonconvex Newton-Type Methods, 2018, INFORMS Journal on Optimization.

[21] Jorge Nocedal et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.

[22] Yair Carmon et al. Accelerated Methods for Non-Convex Optimization, 2018, SIAM J. Optim.

[23] Katya Scheinberg et al. Global convergence rate analysis of unconstrained optimization methods based on probabilistic models, 2015, Mathematical Programming.

[24] Quanquan Gu et al. Stochastic Variance-Reduced Cubic Regularization Methods, 2019, J. Mach. Learn. Res.

[25] S. Bellavia et al. Adaptive Regularization Algorithms with Inexact Evaluations for Nonconvex Optimization, 2018, SIAM J. Optim.

[26] Katya Scheinberg et al. Convergence Rate Analysis of a Stochastic Trust-Region Method via Supermartingales, 2016, INFORMS Journal on Optimization.

[27] Stefania Bellavia et al. Stochastic analysis of an adaptive cubic regularization method under inexact gradient evaluations and dynamic Hessian accuracy, 2020, Optimization.

[28] Peng Xu et al. Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study, 2017, SDM.

[29] Jean-Christophe Pesquet et al. Deep unfolding of a proximal interior point method for image restoration, 2018, Inverse Problems.

[30] Stefania Bellavia et al. Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization, 2018, IMA Journal of Numerical Analysis.

[31] Peng Xu et al. Newton-type methods for non-convex optimization under inexact Hessian information, 2017, Math. Program.

[32] Nicholas I. M. Gould et al. Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints, 2018, SIAM J. Optim.

[33] Jorge Nocedal et al. An investigation of Newton-Sketch and subsampled Newton methods, 2017, Optim. Methods Softw.

[34] P. Toint et al. Strong Evaluation Complexity Bounds for Arbitrary-Order Optimization of Nonconvex Nonsmooth Composite Functions, 2020, arXiv:2001.10802.

[35] High-order Evaluation Complexity of a Stochastic Adaptive Regularization Algorithm for Nonconvex Optimization Using Inexact Function Evaluations and Randomly Perturbed Derivatives, 2020, arXiv:2005.04639.

[36] K. Scheinberg et al. Global Convergence Rate Analysis of a Generic Line Search Algorithm with Noise, 2019, SIAM J. Optim.

[37] Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging, 2021.