Thomas Hofmann | Aurélien Lucchi | Hadi Daneshmand | Jonas Moritz Kohler
[1] Michael I. Jordan, et al. Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent, 2017, COLT.
[2] Yuanzhi Li, et al. Neon2: Finding Local Minima via First-Order Oracles, 2017, NeurIPS.
[3] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[4] Michael I. Jordan, et al. Gradient Descent Converges to Minimizers, 2016, ArXiv.
[5] Kfir Y. Levy, et al. The Power of Normalization: Faster Evasion of Saddle Points, 2016, ArXiv.
[6] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[7] Peng Xu, et al. Newton-type methods for non-convex optimization under inexact Hessian information, 2017, Math. Program.
[8] Yurii Nesterov, et al. Cubic regularization of Newton method and its global performance, 2006, Math. Program.
[9] Daniel P. Robinson, et al. Exploiting negative curvature in deterministic and stochastic optimization, 2017, Mathematical Programming.
[10] Alexander J. Smola, et al. A Generic Approach for Escaping Saddle Points, 2017, AISTATS.
[11] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[12] Christopher J. Hillar, et al. Most Tensor Problems Are NP-Hard, 2009, JACM.
[13] Martin J. Wainwright, et al. Learning Halfspaces and Neural Networks with Random Initialization, 2015, ArXiv.
[14] Aurélien Lucchi, et al. Sub-sampled Cubic Regularization for Non-convex Optimization, 2017, ICML.
[15] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[16] Stefano Soatto, et al. Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks, 2017, 2018 Information Theory and Applications Workshop (ITA).
[17] Eric Moulines, et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning, 2011, NIPS.
[18] Yuchen Zhang, et al. A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics, 2017, COLT.
[19] Max Simchowitz, et al. On the Gap Between Strict-Saddles and True Convexity: An Omega(log d) Lower Bound for Eigenvector Approximation, 2017, ArXiv.
[20] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[21] Tianbao Yang, et al. First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time, 2017, NeurIPS.
[22] Yair Carmon, et al. "Convex Until Proven Guilty": Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions, 2017, ICML.
[23] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[24] Nicholas I. M. Gould, et al. Trust Region Methods, 2000, MOS-SIAM Series on Optimization.
[25] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[26] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[27] Zeyuan Allen-Zhu, et al. Natasha 2: Faster Non-Convex Optimization Than SGD, 2017, NeurIPS.
[28] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[29] Léon Bottou, et al. Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.