暂无分享,去创建一个
[1] Yi Yang,et al. A Unified Analysis of Stochastic Momentum Methods for Deep Learning , 2018, IJCAI.
[2] Enhong Chen,et al. Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions , 2018, ICLR.
[3] Michael I. Jordan,et al. Non-convex Finite-Sum Optimization Via SCSG Methods , 2017, NIPS.
[4] Ohad Shamir,et al. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes , 2012, ICML.
[5] Raef Bassily,et al. On exponential convergence of SGD in non-convex over-parametrized learning , 2018, ArXiv.
[6] Elad Hazan,et al. An optimal algorithm for stochastic strongly-convex optimization , 2010, 1006.2425.
[7] André Elisseeff,et al. Stability and Generalization , 2002, J. Mach. Learn. Res..
[8] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[9] Damek Davis,et al. Proximally Guided Stochastic Subgradient Method for Nonsmooth, Nonconvex Problems , 2017, SIAM J. Optim..
[10] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[11] Yan Yan,et al. Stagewise Training Accelerates Convergence of Testing Error Over SGD , 2018, NeurIPS.
[12] Mark W. Schmidt,et al. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition , 2016, ECML/PKDD.
[13] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[14] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[15] Yann LeCun,et al. Deep learning with Elastic Averaging SGD , 2014, NIPS.
[16] Saeed Ghadimi,et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming , 2013, SIAM J. Optim..
[17] Bruce W. Suter,et al. From error bounds to the complexity of first-order descent methods for convex functions , 2015, Math. Program..
[18] Ohad Shamir,et al. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization , 2011, ICML.
[19] Mark W. Schmidt,et al. A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method , 2012, ArXiv.
[20] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[21] Alexander J. Smola,et al. Stochastic Variance Reduction for Nonconvex Optimization , 2016, ICML.
[22] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[24] Dmitriy Drusvyatskiy,et al. Stochastic subgradient method converges at the rate O(k-1/4) on weakly convex functions , 2018, ArXiv.
[25] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.