Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Stephen J. Wright | Feng Niu | Benjamin Recht | Christopher Ré
[1] John N. Tsitsiklis,et al. Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms , 1984, 1984 American Control Conference.
[2] Editors , 1986, Brain Research Bulletin.
[3] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[4] Luo Zhi-quan,et al. Analysis of an approximate gradient projection method with applications to the backpropagation algorithm , 1994 .
[5] Yuval Rabani,et al. An improved approximation algorithm for multiway cut , 1998, STOC '98.
[6] Paul Tseng,et al. An Incremental Gradient(-Projection) Method with Momentum Term and Adaptive Stepsize Rule , 1998, SIAM J. Optim..
[7] D. Bertsekas,et al. Convergence Rate of Incremental Subgradient Algorithms , 2000 .
[8] Yiming Yang,et al. RCV1: A New Benchmark Collection for Text Categorization Research , 2004, J. Mach. Learn. Res..
[9] Vladimir Kolmogorov,et al. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision , 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[11] Tommi S. Jaakkola,et al. Maximum-Margin Matrix Factorization , 2004, NIPS.
[12] Thorsten Joachims,et al. Training linear SVMs in linear time , 2006, KDD '06.
[13] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[14] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[15] Nathan Srebro,et al. SVM optimization: inverse dependence on training set size , 2008, ICML '08.
[16] John Langford,et al. Sparse Online Learning via Truncated Gradient , 2008, NIPS.
[17] John Langford,et al. Slow Learners are Fast , 2009, NIPS.
[18] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[19] Ruslan Salakhutdinov,et al. Practical Large-Scale Optimization for Max-norm Regularization , 2010, NIPS.
[20] Pablo A. Parrilo,et al. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization , 2007, SIAM Rev..
[21] Andrey Gubarev,et al. Dremel: Interactive Analysis of Web-Scale Datasets , 2011 .
[22] Alexander J. Smola,et al. Parallelized Stochastic Gradient Descent , 2010, NIPS.
[23] Martin J. Wainwright,et al. Distributed Dual Averaging In Networks , 2010, NIPS.
[24] Haixun Wang,et al. Web Scale Entity Resolution using Relational Evidence , 2011 .
[25] Emmanuel J. Candès,et al. Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..
[26] Ohad Shamir,et al. Optimal Distributed Online Prediction Using Mini-Batches , 2010, J. Mach. Learn. Res..
[27] Christopher Ré,et al. Parallel stochastic gradient algorithms for large-scale matrix completion , 2013, Math. Program. Comput..
[28] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .