SADAGRAD: Strongly Adaptive Stochastic Gradient Methods
Enhong Chen | Tianbao Yang | Yi Xu | Zaiyi Chen