[1] Zaïd Harchaoui, et al. A Universal Catalyst for First-Order Optimization, 2015, NIPS.
[2] Jie Liu, et al. Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting, 2015, IEEE Journal of Selected Topics in Signal Processing.
[3] Peter Richtárik, et al. Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses, 2015, ArXiv.
[4] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[5] Peter Richtárik, et al. Semi-Stochastic Gradient Descent Methods, 2013, Front. Appl. Math. Stat.
[6] Aurélien Lucchi, et al. Variance Reduced Stochastic Gradient Descent with Neighbors, 2015, NIPS.
[7] Alexander J. Smola, et al. Efficient mini-batch training for stochastic optimization, 2014, KDD.
[8] Fabian Pedregosa, et al. ASAGA: Asynchronous Parallel SAGA, 2016, AISTATS.
[9] Yurii Nesterov, et al. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, 2012, SIAM J. Optim.
[10] Yoshua Bengio, et al. Practical Recommendations for Gradient-Based Training of Deep Architectures, 2012, Neural Networks: Tricks of the Trade.
[11] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[12] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[13] Peter Richtárik, et al. Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling, 2015, NIPS.
[14] Peter Richtárik, et al. Parallel coordinate descent methods for big data optimization, 2012, Mathematical Programming.
[15] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification, 2008, J. Mach. Learn. Res.
[16] David W. Jacobs, et al. Big Batch SGD: Automated Inference using Adaptive Batch Sizes, 2016, ArXiv.
[17] Ohad Shamir, et al. Optimal Distributed Online Prediction Using Mini-Batches, 2010, J. Mach. Learn. Res.
[18] Avleen Singh Bijral, et al. Mini-Batch Primal and Dual Methods for SVMs, 2013, ICML.
[19] Aaron Defazio, et al. New Optimisation Methods for Machine Learning, 2015, ArXiv.
[20] Shai Shalev-Shwartz, et al. Stochastic dual coordinate ascent methods for regularized loss, 2012, J. Mach. Learn. Res.
[21] Alexander J. Smola, et al. On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants, 2015, NIPS.
[22] Mark W. Schmidt, et al. Minimizing finite sums with the stochastic average gradient, 2013, Mathematical Programming.
[23] Inderjit S. Dhillon, et al. PASSCoDe: Parallel ASynchronous Stochastic dual Co-ordinate Descent, 2015, ICML.
[24] Mark W. Schmidt, et al. Stop Wasting My Gradients: Practical SVRG, 2015, NIPS.
[25] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[26] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[27] Lin Xiao, et al. A Proximal Stochastic Gradient Method with Progressive Variance Reduction, 2014, SIAM J. Optim.
[28] Zeyuan Allen-Zhu, et al. Katyusha: The First Direct Acceleration of Stochastic Gradient Methods, 2016, J. Mach. Learn. Res.