Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization

We develop two new stochastic Gauss-Newton algorithms for solving a class of nonconvex stochastic compositional optimization problems that frequently arise in practice. We consider both the expectation and finite-sum settings under standard assumptions, and use both classical stochastic and SARAH estimators to approximate function values and Jacobians. In the expectation case, we establish an $\mathcal{O}(\varepsilon^{-2})$ iteration complexity to reach a stationary point in expectation and estimate the total number of stochastic oracle calls for both the function value and its Jacobian, where $\varepsilon$ is the desired accuracy. In the finite-sum case, we also estimate the $\mathcal{O}(\varepsilon^{-2})$ iteration complexity and the total number of oracle calls with high probability. To the best of our knowledge, this is the first time such a global stochastic oracle complexity has been established for stochastic Gauss-Newton methods. Finally, we illustrate our theoretical results with two numerical examples on both synthetic and real datasets.
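The abstract does not spell out the problem form or the update rules, so the following is only a minimal sketch of the two ingredients it names: a SARAH-style recursive estimator for function values and Jacobians, and a damped Gauss-Newton step. It assumes a finite-sum map $F(x) = \frac{1}{n}\sum_i F_i(x)$ with a least-squares outer function $\frac{1}{2}\|F(x)\|^2$. The component map `tanh(A_i x) - b_i`, the damping parameter `M`, and every function name below are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def F_i(A_i, b_i, x):
    """Toy component map F_i(x) = tanh(A_i x) - b_i (an assumption, not from the paper)."""
    return np.tanh(A_i @ x) - b_i

def J_i(A_i, x):
    """Jacobian of F_i at x: diag(1 - tanh(A_i x)^2) @ A_i."""
    return (1.0 - np.tanh(A_i @ x) ** 2)[:, None] * A_i

def full_estimates(A, b, x):
    """Exact averages over all n components, used at snapshot points."""
    F = np.mean([F_i(A[i], b[i], x) for i in range(len(A))], axis=0)
    J = np.mean([J_i(A[i], x) for i in range(len(A))], axis=0)
    return F, J

def sarah_update(A, b, x_new, x_old, F_prev, J_prev, batch):
    """SARAH-style recursion: previous estimate plus a mini-batch average of differences."""
    dF = np.mean([F_i(A[i], b[i], x_new) - F_i(A[i], b[i], x_old) for i in batch], axis=0)
    dJ = np.mean([J_i(A[i], x_new) - J_i(A[i], x_old) for i in batch], axis=0)
    return F_prev + dF, J_prev + dJ

def gauss_newton_step(F_est, J_est, M):
    """Closed-form minimizer of 0.5*||F_est + J_est d||^2 + 0.5*M*||d||^2 over d."""
    p = J_est.shape[1]
    return np.linalg.solve(J_est.T @ J_est + M * np.eye(p), -J_est.T @ F_est)

def run(A, b, x0, M=1.0, epochs=3, inner=10, batch_size=8, seed=0):
    """Double loop: exact snapshot per epoch, SARAH-style updates in between."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    n = len(A)
    for _ in range(epochs):
        F_est, J_est = full_estimates(A, b, x)  # snapshot with exact averages
        for _ in range(inner):
            x_old = x
            x = x + gauss_newton_step(F_est, J_est, M)
            batch = rng.choice(n, size=batch_size, replace=False)
            F_est, J_est = sarah_update(A, b, x, x_old, F_est, J_est, batch)
    return x
```

Here `A` is expected to have shape `(n, m, p)` and `b` shape `(n, m)`, so that `run(A, b, np.zeros(p))` returns an iterate of shape `(p,)`. The closed-form step relies on the least-squares assumption; a nonsmooth outer function would need a subproblem solver instead of `np.linalg.solve`.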

Lam M. Nguyen | Quoc Tran-Dinh | Nhan H. Pham

[1] Damek Davis, et al. Proximally Guided Stochastic Subgradient Method for Nonsmooth, Nonconvex Problems, 2017, SIAM J. Optim.

[2] R. Tyrrell Rockafellar, et al. Stochastic variational inequalities: single-stage to multistage, 2017, Math. Program.

[3] Antonin Chambolle, et al. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging, 2011, Journal of Mathematical Imaging and Vision.

[4] Jie Liu, et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient, 2017, ICML.

[5] T. Chan, et al. Primal dual algorithms for convex models and applications to image restoration, registration and nonlocal inpainting, 2010.

[6] Dmitriy Drusvyatskiy, et al. Stochastic model-based minimization of weakly convex functions, 2018, SIAM J. Optim.

[7] Lam M. Nguyen, et al. ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization, 2019, J. Mach. Learn. Res.

[8] Mengdi Wang, et al. Accelerating Stochastic Composition Optimization, 2016, NIPS.

[9] Mengdi Wang, et al. Finite-sum Composition Optimization via Variance Reduced Gradient Descent, 2016, AISTATS.

[10] Robert D. Tortora, et al. Sampling: Design and Analysis, 2000.

[11] Moritz Diehl, et al. Proximal methods for minimizing the sum of a convex function and a composite function, 2011.

[12] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.

[13] Feng Ruan, et al. Stochastic Methods for Composite and Weakly Convex Optimization Problems, 2017, SIAM J. Optim.

[14] Alexander Shapiro, et al. Validation analysis of mirror descent stochastic approximation method, 2012, Math. Program.

[15] J. Blanchet, et al. Unbiased Simulation for Optimizing Stochastic Function Compositions, 2017, 1711.07564.

[16] Marten van Dijk, et al. Optimal Finite-Sum Smooth Non-Convex Optimization with SARAH, 2019, ArXiv.

[17] Junyu Zhang, et al. Stochastic variance-reduced prox-linear algorithms for nonconvex composite optimization, 2020, Mathematical Programming.

[18] Junyu Zhang, et al. A Stochastic Composite Gradient Method with Incremental Variance Reduction, 2019, NeurIPS.

[19] Lin Xiao, et al. MultiLevel Composite Stochastic Optimization via Nested Variance Reduction, 2019, SIAM J. Optim.

[20] Musa A. Mammadov, et al. From Convex to Nonconvex: A Loss Function Analysis for Binary Classification, 2010, 2010 IEEE International Conference on Data Mining Workshops.

[21] Quoc Tran-Dinh, et al. Generalized self-concordant functions: a recipe for Newton-type methods, 2017, Mathematical Programming.

[22] Mengdi Wang, et al. Multilevel Stochastic Gradient Methods for Nested Composition Optimization, 2018, SIAM J. Optim.

[23] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.

[24] Dmitriy Drusvyatskiy, et al. Efficiency of minimizing compositions of convex functions and smooth maps, 2016, Math. Program.

[25] Yue Yu, et al. Fast Stochastic Variance Reduced ADMM for Stochastic Composition Optimization, 2017, IJCAI.

[26] Xiaoming Yuan, et al. Adaptive Primal-Dual Hybrid Gradient Methods for Saddle-Point Problems, 2013, 1305.0546.

[27] Zhiqiang Zhou, et al. Algorithms for stochastic optimization with function or expectation constraints, 2016, Comput. Optim. Appl.

[28] Guanghui Lan, et al. Algorithms for stochastic optimization with expectation constraints, 2016, 1604.03887.

[29] Saeed Ghadimi, et al. Accelerated gradient methods for nonconvex nonlinear and stochastic programming, 2013, Mathematical Programming.

[30] Mengdi Wang, et al. Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions, 2014, Mathematical Programming.

[31] Liu Liu, et al. Variance Reduced Methods for Non-Convex Composition Optimization, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32] Volkan Cevher, et al. A Smooth Primal-Dual Optimization Framework for Nonsmooth Composite Convex Minimization, 2015, SIAM J. Optim.

[33] Yangyang Xu, et al. Katyusha Acceleration for Convex Finite-Sum Compositional Optimization, 2019, INFORMS J. Optim.

[34] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.

[35] R. Rockafellar, et al. Optimization of conditional value-at-risk, 2000.

[36] Yurii Nesterov, et al. Modified Gauss–Newton scheme with worst case guarantees for global performance, 2007, Optim. Methods Softw.

[37] Stephen J. Wright, et al. A proximal method for composite minimization, 2008, Mathematical Programming.

[38] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.

[39] Q. Tran-Dinh. Proximal Alternating Penalty Algorithms for Constrained Convex Optimization, 2017, 1711.01367.

[40] Heinz H. Bauschke, et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2011, CMS Books in Mathematics.