Scalability of Stochastic Gradient Descent based on "Smart" Sampling Techniques
Stéphan Clémençon | Aurélien Bellet | Ons Jelassi | Guillaume Papa
[1] Léon Bottou, et al. Stochastic Gradient Descent Tricks, 2012, Neural Networks: Tricks of the Trade.
[2] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[3] Tong Zhang, et al. Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling, 2014, ArXiv.
[4] Alan J. Lee, et al. U-Statistics: Theory and Practice, 1990.
[5] Guillaume Papa, et al. Optimal survey schemes for stochastic gradient descent with applications to M-estimation, 2015, ESAIM: Probability and Statistics.
[6] Ron Bekkerman, et al. Scaling up Machine Learning, 2011.
[7] Stéphan Clémençon, et al. Scaling up M-estimation via sampling designs: The Horvitz-Thompson stochastic gradient descent, 2014, 2014 IEEE International Conference on Big Data (Big Data).
[8] Amaury Habrard, et al. Robustness and generalization for metric learning, 2012, Neurocomputing.
[9] Rong Jin, et al. Regularized Distance Metric Learning: Theory and Algorithm, 2009, NIPS.
[10] Carlos Guestrin, et al. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud, 2012.
[11] Gonzalo Mateos, et al. Distributed Sparse Linear Regression, 2010, IEEE Transactions on Signal Processing.
[12] Stéphan Clémençon, et al. Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling, 2013, SDM.
[13] Michael I. Jordan. On statistics, computation and scalability, 2013, ArXiv.
[14] Stéphan Clémençon, et al. Scaling-up Empirical Risk Minimization: Optimization of Incomplete $U$-statistics, 2015, J. Mach. Learn. Res.
[15] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.
[16] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[17] Marc Sebban, et al. A Survey on Metric Learning for Feature Vectors and Structured Data, 2013, ArXiv.
[18] Nathan Halko, et al. An Algorithm for the Principal Component Analysis of Large Data Sets, 2010, SIAM J. Sci. Comput.
[19] Pascal Bianchi, et al. On-line learning gossip algorithm in multi-agent systems with local decision rules, 2013, 2013 IEEE International Conference on Big Data.
[20] Ohad Shamir, et al. Optimal Distributed Online Prediction, 2011, ICML.
[21] David Mort. The Statistics, 2020, Sources of Non-Official UK Statistics.
[22] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[23] Qiong Cao, et al. Generalization bounds for metric and similarity learning, 2012, Machine Learning.
[24] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[25] Tong Zhang, et al. Stochastic Optimization with Importance Sampling for Regularized Loss Minimization, 2014, ICML.
[26] Martin J. Wainwright, et al. Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling, 2010, IEEE Transactions on Automatic Control.
[27] Andrea Montanari, et al. Message-passing algorithms for compressed sensing, 2009, Proceedings of the National Academy of Sciences.