Information Theoretic Limits of Data Shuffling for Distributed Learning