On Nonconvex Optimization for Machine Learning
[1] Boris Polyak. Gradient methods for the minimisation of functionals, 1963.
[2] Zeyuan Allen-Zhu, et al. Natasha 2: Faster Non-Convex Optimization Than SGD, 2017, NeurIPS.
[3] Yair Carmon, et al. Lower bounds for finding stationary points I, 2017, Mathematical Programming.
[4] Yair Carmon, et al. Lower bounds for finding stationary points II: first-order methods, 2017, Mathematical Programming.
[5] Yair Carmon, et al. Accelerated Methods for Nonconvex Optimization, 2018, SIAM J. Optim.
[6] Quanquan Gu, et al. Stochastic Recursive Variance-Reduced Cubic Regularization Methods, 2019, AISTATS.
[7] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[8] Michael I. Jordan, et al. CoCoA: A General Framework for Communication-Efficient Distributed Optimization, 2016, J. Mach. Learn. Res.
[9] Anima Anandkumar, et al. Efficient approaches for escaping higher order saddle points in non-convex optimization, 2016, COLT.
[10] Prateek Jain, et al. Low-rank matrix completion using alternating minimization, 2012, STOC '13.
[11] Michael I. Jordan, et al. Gradient Descent Only Converges to Minimizers, 2016, COLT.
[12] Yair Carmon, et al. Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations, 2020, COLT.
[13] Michael I. Jordan, et al. Gradient Descent Can Take Exponential Time to Escape Saddle Points, 2017, NIPS.
[14] Nathan Srebro, et al. Lower Bounds for Non-Convex Stochastic Optimization, 2019, ArXiv.
[15] Quanquan Gu, et al. Finding Local Minima via Stochastic Nested Variance Reduction, 2018, ArXiv.
[16] Martin J. Wainwright, et al. Communication-efficient algorithms for statistical optimization, 2012, 51st IEEE Conference on Decision and Control (CDC).
[17] Prateek Jain, et al. Non-convex Robust PCA, 2014, NIPS.
[18] Yurii Nesterov, et al. Squared Functional Systems and Optimization Problems, 2000.
[19] Michael I. Jordan, et al. First-order methods almost always avoid strict saddle points, 2019, Mathematical Programming.
[20] Michael I. Jordan, et al. Stochastic Cubic Regularization for Fast Nonconvex Optimization, 2017, NeurIPS.
[21] John Wright, et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture, 2015, IEEE Transactions on Information Theory.
[22] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[23] A. Bovier, et al. Metastability in Reversible Diffusion Processes I: Sharp Asymptotics for Capacities and Exit Times, 2004.
[24] Alexander J. Smola, et al. A Generic Approach for Escaping Saddle Points, 2017, AISTATS.
[25] Tong Zhang, et al. SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator, 2018, NeurIPS.
[26] Kfir Y. Levy, et al. The Power of Normalization: Faster Evasion of Saddle Points, 2016, ArXiv.
[27] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, ArXiv.
[28] Nicolas Boumal, et al. The non-convex Burer-Monteiro approach works on smooth semidefinite programs, 2016, NIPS.
[29] Michael I. Jordan, et al. Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent, 2017, COLT.
[30] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[31] Yi Zheng, et al. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis, 2017, ICML.
[32] Tengyu Ma, et al. Matrix Completion has No Spurious Local Minimum, 2016, NIPS.
[33] Nathan Srebro, et al. Global Optimality of Local Search for Low Rank Matrix Recovery, 2016, NIPS.
[34] Yuanzhi Li, et al. Neon2: Finding Local Minima via First-Order Oracles, 2017, NeurIPS.
[35] R. Tweedie, et al. Exponential convergence of Langevin distributions and their discrete approximations, 1996.
[36] Yair Carmon, et al. Gradient Descent Finds the Cubic-Regularized Nonconvex Newton Step, 2019, SIAM J. Optim.
[37] John Wright, et al. A Geometric Analysis of Phase Retrieval, 2016, IEEE International Symposium on Information Theory (ISIT).
[38] Yuchen Zhang, et al. A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics, 2017, COLT.
[39] Zhouchen Lin, et al. Sharp Analysis for Nonconvex SGD Escaping from Saddle Points, 2019, COLT.
[40] Michael I. Jordan, et al. A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm, 2019, ArXiv.
[41] Tengyu Ma, et al. Finding approximate local minima faster than gradient descent, 2016, STOC.
[42] A. Nemirovsky, et al. Problem Complexity and Method Efficiency in Optimization, 1983.
[43] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[44] Nicolas Boumal, et al. On the low-rank approach for semidefinite programs arising in synchronization and community detection, 2016, COLT.
[45] Yurii Nesterov, et al. Cubic regularization of Newton method and its global performance, 2006, Math. Program.
[46] Andrea Montanari, et al. Solving SDPs for synchronization and MaxCut problems via the Grothendieck inequality, 2017, COLT.
[47] H. Robbins. A Stochastic Approximation Method, 1951.
[48] Tianbao Yang, et al. First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time, 2017, NeurIPS.
[49] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[50] Thomas Hofmann, et al. Escaping Saddles with Stochastic Gradients, 2018, ICML.