Quadratic and Cubic Regularisation Methods with Inexact Function and Random Derivatives for Finite-Sum Minimisation

This paper focuses on regularisation methods that use models of up to the third order to search for up to second-order critical points of a finite-sum minimisation problem. The variant presented belongs to the framework of [1]: it employs random models whose accuracy is guaranteed with a sufficiently large prefixed probability, together with deterministic inexact function evaluations within a prescribed level of accuracy. Without assuming unbiased estimators, the expected number of iterations is $\mathcal{O}(\epsilon_1^{-2})$ or $\mathcal{O}(\epsilon_1^{-3/2})$ when searching for a first-order critical point using a second- or third-order model, respectively, and $\mathcal{O}(\max[\epsilon_1^{-3/2}, \epsilon_2^{-3}])$ when seeking second-order critical points with a third-order model, where $\epsilon_j$, $j \in \{1,2\}$, is the $j$th-order tolerance. These results match the worst-case optimal complexity of the deterministic counterpart of the method. Preliminary numerical tests for first-order optimality in the context of nonconvex binary classification in imaging, with and without Artificial Neural Networks (ANNs), are presented and discussed.
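To make the finite-sum setting concrete, the sketch below shows one iteration of an adaptive cubic-regularisation step with subsampled (hence random) derivatives for $f(x) = \frac{1}{N}\sum_{i=1}^N f_i(x)$. It is a minimal illustration of the general idea only, not the paper's algorithm: the function name `arc_step`, the sampling fraction, the acceptance threshold `eta`, and the update factor `gamma` are assumptions introduced for this example, and the inner fixed-point solve is a crude placeholder for a proper cubic-subproblem solver.

```python
# Minimal sketch (not the paper's algorithm) of one subsampled adaptive
# cubic-regularisation (ARC) iteration for f(x) = (1/N) * sum_i f_i(x).
import numpy as np

def arc_step(x, fs, grads, hesss, sigma, rng,
             sample_frac=0.1, eta=0.1, gamma=2.0):
    """One step; fs/grads/hesss are per-term callables. Returns (x, sigma)."""
    N = len(fs)
    S = rng.choice(N, size=max(1, int(sample_frac * N)), replace=False)
    g = np.mean([grads[i](x) for i in S], axis=0)   # random gradient estimate
    H = np.mean([hesss[i](x) for i in S], axis=0)   # random Hessian estimate

    # Crude fixed-point iteration on the stationarity condition
    #   (H + sigma * ||s|| * I) s = -g
    # of the cubic model m(s) = g^T s + 0.5 s^T H s + (sigma/3) ||s||^3.
    # (A placeholder; it may fail for strongly indefinite H.)
    d = len(x)
    s = np.zeros(d)
    for _ in range(50):
        s = -np.linalg.solve(H + (sigma * np.linalg.norm(s) + 1e-8) * np.eye(d), g)

    model_decrease = -(g @ s + 0.5 * s @ H @ s
                       + sigma / 3.0 * np.linalg.norm(s) ** 3)
    if model_decrease <= 0:           # bad subproblem solve: inflate sigma
        return x, sigma * gamma

    f = lambda z: np.mean([fi(z) for fi in fs])  # exact here; inexact in the paper
    rho = (f(x) - f(x + s)) / model_decrease     # actual vs. predicted decrease
    if rho >= eta:                               # successful: accept, relax sigma
        return x + s, max(sigma / gamma, 1e-8)
    return x, sigma * gamma                      # unsuccessful: reject, inflate sigma

# Toy usage on a small nonconvex finite sum with terms f_i(x) = cos(a_i^T x).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
fs = [lambda z, a=a: np.cos(a @ z) for a in A]
grads = [lambda z, a=a: -np.sin(a @ z) * a for a in A]
hesss = [lambda z, a=a: -np.cos(a @ z) * np.outer(a, a) for a in A]
x, sigma = np.ones(3), 1.0
for _ in range(30):
    x, sigma = arc_step(x, fs, grads, hesss, sigma, rng, sample_frac=0.5)
```

With a third-order model the same accept/reject mechanism would apply to a quartically regularised Taylor model; the subsampled derivative estimates play the role of the random models whose accuracy holds with the prefixed probability.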

[1] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.

[2] Marcos Raydan et al. The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem, 1997, SIAM J. Optim.

[3] Yoshua Bengio et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[4] Nicol N. Schraudolph et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.

[5] Nicholas I. M. Gould et al. Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity, 2011, Math. Program.

[6] P. Toint et al. An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity, 2012.

[7] Nicholas I. M. Gould et al. Complexity bounds for second-order optimality in unconstrained optimization, 2012, J. Complex.

[8] Joel A. Tropp et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.

[9] Marco Sciandrone et al. On the use of iterative methods in cubic regularization for unconstrained optimization, 2015, Comput. Optim. Appl.

[10] Yair Carmon et al. Gradient Descent Efficiently Finds the Cubic-Regularized Non-Convex Newton Step, 2016, arXiv.

[11] José Mario Martínez et al. Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models, 2017, Math. Program.

[12] Aurélien Lucchi et al. Sub-sampled Cubic Regularization for Non-convex Optimization, 2017, ICML.

[13] Michael Unser et al. Deep Convolutional Neural Network for Inverse Problems in Imaging, 2016, IEEE Transactions on Image Processing.

[14] Michael Unser et al. Convolutional Neural Networks for Inverse Problems in Imaging: A Review, 2017, IEEE Signal Processing Magazine.

[15] Tengyu Ma et al. Finding approximate local minima faster than gradient descent, 2016, STOC.

[16] Katya Scheinberg et al. Stochastic optimization using a trust-region method and random models, 2015, Mathematical Programming.

[17] A Stochastic Line Search Method with Convergence Rate Analysis, 2018, arXiv:1807.07994.

[18] Tianyi Lin et al. On Adaptive Cubic Regularized Newton's Methods for Convex Optimization via Random Sampling, 2018.

[19] Valeria Ruggiero et al. On the steplength selection in gradient methods for unconstrained optimization, 2018, Appl. Math. Comput.

[20] Peng Xu et al. Inexact Nonconvex Newton-Type Methods, 2018, INFORMS Journal on Optimization.

[21] Jorge Nocedal et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.

[22] Yair Carmon et al. Accelerated Methods for Non-Convex Optimization, 2018, SIAM J. Optim.

[23] Katya Scheinberg et al. Global convergence rate analysis of unconstrained optimization methods based on probabilistic models, 2015, Mathematical Programming.

[24] Quanquan Gu et al. Stochastic Variance-Reduced Cubic Regularization Methods, 2019, J. Mach. Learn. Res.

[25] S. Bellavia et al. Adaptive Regularization Algorithms with Inexact Evaluations for Nonconvex Optimization, 2018, SIAM J. Optim.

[26] Katya Scheinberg et al. Convergence Rate Analysis of a Stochastic Trust-Region Method via Supermartingales, 2016, INFORMS Journal on Optimization.

[27] Stefania Bellavia et al. Stochastic analysis of an adaptive cubic regularization method under inexact gradient evaluations and dynamic Hessian accuracy, 2020, Optimization.

[28] Peng Xu et al. Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study, 2017, SDM.

[29] Jean-Christophe Pesquet et al. Deep unfolding of a proximal interior point method for image restoration, 2018, Inverse Problems.

[30] Stefania Bellavia et al. Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization, 2018, IMA Journal of Numerical Analysis.

[31] Peng Xu et al. Newton-type methods for non-convex optimization under inexact Hessian information, 2017, Math. Program.

[32] Nicholas I. M. Gould et al. Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints, 2018, SIAM J. Optim.

[33] Jorge Nocedal et al. An investigation of Newton-Sketch and subsampled Newton methods, 2017, Optim. Methods Softw.

[34] P. Toint et al. Strong Evaluation Complexity Bounds for Arbitrary-Order Optimization of Nonconvex Nonsmooth Composite Functions, 2020, arXiv:2001.10802.

[35] High-order Evaluation Complexity of a Stochastic Adaptive Regularization Algorithm for Nonconvex Optimization Using Inexact Function Evaluations and Randomly Perturbed Derivatives, 2020, arXiv:2005.04639.

[36] K. Scheinberg et al. Global Convergence Rate Analysis of a Generic Line Search Algorithm with Noise, 2019, SIAM J. Optim.

[37] Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging, 2021.