Yi Zhou | Yingbin Liang | Vahid Tarokh | Junjie Yang | Huishuai Zhang
[1] Xiaodong Li, et al. Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization, 2016, Applied and Computational Harmonic Analysis.
[2] Guanghui Lan. An optimal method for stochastic composite optimization, 2011, Mathematical Programming.
[3] Thomas Hofmann, et al. Escaping Saddles with Stochastic Gradients, 2018, ICML.
[4] Deanna Needell, et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, 2013, Mathematical Programming.
[5] S. Linnainmaa. Taylor expansion of the accumulated rounding error, 1976, BIT Numerical Mathematics.
[6] Yi Zhou, et al. Geometrical properties and accelerated gradient solvers of non-convex phase retrieval, 2016, 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[7] Yingbin Liang, et al. A Nonconvex Approach for Phase Retrieval: Reshaped Wirtinger Flow and Incremental Algorithms, 2017, Journal of Machine Learning Research.
[8] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[9] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Review.
[10] Alexander J. Smola, et al. A Generic Approach for Escaping Saddle Points, 2017, AISTATS.
[11] Alexander Shapiro, et al. Robust Stochastic Approximation Approach to Stochastic Programming, 2009, SIAM Journal on Optimization.
[12] Saeed Ghadimi, et al. Accelerated gradient methods for nonconvex nonlinear and stochastic programming, 2013, Mathematical Programming.
[13] Yi Zhou, et al. Characterization of Gradient Dominance and Regularity Conditions for Neural Networks, 2017, arXiv.
[14] Yi Zhou, et al. SpiderBoost: A Class of Faster Variance-reduced Algorithms for Nonconvex Optimization, 2018, arXiv.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images, 2009.
[17] Inderjit S. Dhillon, et al. Recovery Guarantees for One-hidden-layer Neural Networks, 2017, ICML.
[18] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, NIPS.
[19] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proceedings of the IEEE.
[20] Saeed Ghadimi, et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization, 2013, Mathematical Programming.
[21] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[22] Pramod K. Varshney, et al. Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization, 2017, ICML.
[23] Mark W. Schmidt, et al. Minimizing finite sums with the stochastic average gradient, 2013, Mathematical Programming.
[24] Eric Moulines, et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning, 2011, NIPS.
[25] Max Simchowitz, et al. Low-rank Solutions of Linear Matrix Equations via Procrustes Flow, 2015, ICML.
[26] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[27] H. Robbins. A Stochastic Approximation Method, 1951, The Annals of Mathematical Statistics.
[28] Sanjiv Kumar, et al. On the Convergence of Adam and Beyond, 2018, ICLR.
[29] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, Journal of Machine Learning Research.
[30] Yi Zhou, et al. Critical Points of Linear Neural Networks: Analytical Forms and Landscape Properties, 2017, ICLR.
[31] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[32] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[33] Ronald L. Rivest, et al. Training a 3-node neural network is NP-complete, 1988, COLT.
[34] Yuanzhi Li, et al. An Alternative View: When Does SGD Escape Local Minima?, 2018, ICML.