On Nonconvex Optimization for Machine Learning
