暂无分享,去创建一个
[1] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[2] John Langford,et al. Slow Learners are Fast , 2009, NIPS.
[3] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[4] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[5] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.
[6] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[7] John C. Duchi,et al. Distributed delayed stochastic optimization , 2011, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[8] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[9] Mark W. Schmidt,et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets , 2012, NIPS.
[10] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[11] Saeed Ghadimi,et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming , 2013, SIAM J. Optim..
[12] Alexander J. Smola,et al. Communication Efficient Distributed Machine Learning with the Parameter Server , 2014, NIPS.
[13] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[14] Stephen J. Wright,et al. An Asynchronous Parallel Randomized Kaczmarz Algorithm , 2014, ArXiv.
[15] James T. Kwok,et al. Asynchronous Distributed ADMM for Consensus Optimization , 2014, ICML.
[16] James T. Kwok,et al. Fast Distributed Asynchronous SGD with Variance Reduction , 2015, ArXiv.
[17] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[18] Yijun Huang,et al. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization , 2015, NIPS.
[19] Alexander J. Smola,et al. On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants , 2015, NIPS.
[20] Bin Gu,et al. Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction , 2016, ArXiv.
[21] Bin Gu,et al. Asynchronous Stochastic Block Coordinate Descent with Variance Reduction , 2016, 1610.09447.
[22] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[23] Zeyuan Allen Zhu,et al. Variance Reduction for Faster Non-Convex Optimization , 2016, ICML.
[24] Heng Huang,et al. Distributed Asynchronous Dual-Free Stochastic Dual Coordinate Ascent , 2016 .
[25] Alexander J. Smola,et al. Stochastic Variance Reduction for Nonconvex Optimization , 2016, ICML.
[26] Wu-Jun Li,et al. Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee , 2016, AAAI.
[27] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[28] Dimitris S. Papailiopoulos,et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization , 2015, SIAM J. Optim..
[29] Mingyi Hong,et al. A Distributed, Asynchronous, and Incremental Algorithm for Nonconvex Optimization: An ADMM Approach , 2014, IEEE Transactions on Control of Network Systems.