Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

Given a non-convex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue −σ of the Hessian. This parameter σ captures how strongly non-convex f(x) is, and is analogous to the strong convexity parameter for convex optimization. At least in theory, our methods outperform known results for a range of the parameter σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ0 such that the (currently) fastest methods for σ > σ0 and for σ < σ0 have different behaviors: the former scales with n^{2/3} and the latter scales with n^{3/4}.
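For concreteness, the setting described in the abstract can be written out as follows. This is a minimal sketch using standard notation: the accuracy parameter ε and the Hessian lower-bound form of the σ condition are not spelled out above and are filled in here with their conventional definitions.

```latex
% Finite-sum objective: f is an average of n smooth (possibly non-convex) functions
\min_{x \in \mathbb{R}^d} \quad f(x) \;=\; \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% Strong non-convexity parameter \sigma: the smallest (negative) eigenvalue of the
% Hessian is bounded below by -\sigma (the non-convex analogue of strong convexity)
\nabla^2 f(x) \;\succeq\; -\sigma I \qquad \text{for all } x

% Goal: an approximate stationary point, i.e., a point with small gradient norm
\text{find } x \ \text{such that} \ \|\nabla f(x)\| \;\le\; \varepsilon
```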
