A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates
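The paper concerns SGD in which each update applies a stochastic gradient evaluated at a stale iterate from a fixed number of steps earlier. Below is a minimal sketch of that fixed-delay update rule on a toy least-squares objective; the function names, noise model, step size, and delay value are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def delayed_sgd(grad, x0, step_size, delay, num_steps, rng):
    """Run SGD where each update uses the stochastic gradient evaluated
    at the iterate from `delay` steps earlier (a fixed-delay model)."""
    iterates = [np.array(x0, dtype=float)]
    x = iterates[0].copy()
    for t in range(num_steps):
        # Index of the (possibly stale) iterate whose gradient we apply.
        stale_index = max(0, t - delay)
        x = x - step_size * grad(iterates[stale_index], rng)
        iterates.append(x.copy())
    return x

# Toy example: least squares with additive gradient noise.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])

def noisy_grad(x, rng):
    return A.T @ (A @ x - b) + 0.1 * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
x_hat = delayed_sgd(noisy_grad, x0=np.zeros(2), step_size=0.02,
                    delay=5, num_steps=2000, rng=rng)
print(x_hat)  # should land near the least-squares solution [0.5, -1.0]
```

The step size is kept small relative to the smoothness constant and the delay, in line with the usual requirement that the product of step size, smoothness, and delay stay bounded for delayed updates to remain stable.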
[1] Ohad Shamir, et al. Distributed stochastic optimization and learning, 2014, 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[2] John C. Duchi, et al. Distributed delayed stochastic optimization, 2011, 2012 IEEE 51st Conference on Decision and Control (CDC).
[3] Dimitris S. Papailiopoulos, et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization, 2015, SIAM J. Optim.
[4] Alexander J. Smola, et al. AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization, 2015, ArXiv.
[5] Fabian Pedregosa, et al. Improved asynchronous parallel optimization analysis for stochastic incremental methods, 2018, J. Mach. Learn. Res.
[6] Nathan Srebro, et al. Tight Complexity Bounds for Optimizing Composite Objectives, 2016, NIPS.
[7] Ohad Shamir, et al. On Lower and Upper Bounds in Smooth and Strongly Convex Optimization, 2016, J. Mach. Learn. Res.
[8] Sébastien Bubeck, et al. Convex Optimization: Algorithms and Complexity, 2014, Found. Trends Mach. Learn.
[9] Francis R. Bach, et al. From Averaging to Acceleration, There is Only a Step-size, 2015, COLT.
[10] Victor Y. Pan, et al. How Bad Are Vandermonde Matrices?, 2015, SIAM J. Matrix Anal. Appl.
[11] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[12] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[13] Yurii Nesterov, et al. Lectures on Convex Optimization, 2018.
[14] Ohad Shamir, et al. Optimal Distributed Online Prediction Using Mini-Batches, 2010, J. Mach. Learn. Res.
[15] A. S. Nemirovsky and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization, 1983.
[16] András György, et al. Online Learning under Delayed Feedback, 2013, ICML.
[17] Ohad Shamir, et al. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes, 2012, ICML.
[18] John Langford, et al. Slow Learners are Fast, 2009, NIPS.
[19] Hamid Reza Feyzmahdavian, et al. An asynchronous mini-batch algorithm for regularized stochastic optimization, 2015, CDC.
[20] Christopher Ré, et al. Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care, 2015, NIPS.
[21] Vivek S. Borkar, et al. Distributed Asynchronous Incremental Subgradient Methods, 2001.
[22] Xiaojing Ye, et al. Decentralized Consensus Algorithm with Delayed and Stochastic Gradients, 2016, SIAM J. Optim.
[23] Yi Zhou, et al. An optimal randomized incremental gradient method, 2015, Mathematical Programming.
[24] Hamid Reza Feyzmahdavian, et al. A delayed proximal gradient method with linear convergence rate, 2014, 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP).
[25] Stephen J. Wright, et al. Behavior of accelerated gradient methods near critical points of nonconvex functions, 2017, Math. Program.
[26] Ohad Shamir, et al. Better Mini-Batch Algorithms via Accelerated Gradient Methods, 2011, NIPS.