On the Optimality of Averaging in Distributed Statistical Learning
[1] P. J. Huber. Robust Regression: Asymptotics, Conjectures and Monte Carlo, 1973.
[2] S. Orszag, et al. Advanced Mathematical Methods for Scientists and Engineers, 1979.
[3] Hani Doss, et al. Bias Reduction When There Is No Unbiased Estimate, 1989.
[4] László Györfi, et al. A Probabilistic Theory of Pattern Recognition, 1996, Stochastic Modelling and Applied Probability.
[5] Aman Ullah, et al. The second-order bias and mean squared error of nonlinear estimators, 1996.
[6] A. V. D. Vaart, et al. Asymptotic Statistics, 1998.
[7] A. Rukhin. Matrix Variate Distributions, 1999, The Multivariate Normal Distribution.
[8] S. Orszag, et al. Advanced Mathematical Methods for Scientists and Engineers I: Asymptotic Methods and Perturbation Theory, 1999.
[9] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[10] S. Rosset, et al. Piecewise linear regularized solution paths, 2007, 0708.2197.
[11] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[12] Nathan Srebro, et al. SVM optimization: inverse dependence on training set size, 2008, ICML '08.
[13] Gideon S. Mann, et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, 2009, NIPS.
[14] Hairong Kuang, et al. The Hadoop Distributed File System, 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).
[15] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[16] Scott Shenker, et al. Spark: Cluster Computing with Working Sets, 2010, HotCloud.
[17] John Langford, et al. Scaling up machine learning: parallel and distributed approaches, 2011, KDD '11 Tutorials.
[18] Martin J. Wainwright, et al. Communication-efficient algorithms for statistical optimization, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[19] Alfred O. Hero, et al. Distributed principal component analysis on networks via directed graphical models, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Martin J. Wainwright, et al. Divide and Conquer Kernel Ridge Regression, 2013, COLT.
[21] P. Bickel, et al. Optimal M-estimation in high-dimensional regression, 2013, Proceedings of the National Academy of Sciences.
[22] P. Bickel, et al. On robust regression with high-dimensional predictors, 2013, Proceedings of the National Academy of Sciences.
[23] Noureddine El Karoui, et al. Asymptotic behavior of unregularized and ridge-regularized high-dimensional robust regression estimators: rigorous results, 2013, 1311.2445.
[24] Andrea Montanari, et al. High dimensional robust M-estimation: asymptotic variance via approximate message passing, 2013, Probability Theory and Related Fields.
[25] Tim Kraska, et al. MLI: An API for Distributed Machine Learning, 2013, 2013 IEEE 13th International Conference on Data Mining.
[26] Shie Mannor, et al. Distributed Robust Learning, 2014, ArXiv.
[27] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[28] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[29] Qiang Liu, et al. Distributed Estimation, Information Loss and Exponential Families, 2014, NIPS.
[30] Dan Crisan, et al. A simple scheme for the parallelization of particle filters and its application to the tracking of complex stochastic systems, 2014, 1407.8071.
[31] Qiang Liu, et al. Communication-efficient sparse regression: a one-shot approach, 2015, ArXiv.
[32] Stanislav Minsker. Geometric median and robust estimation in Banach spaces, 2013, 1308.1334.
[33] K. Kim. Higher Order Bias Correcting Moment Equation for M-Estimation and Its Higher Order Efficiency, 2016.
[34] Daniel J. Hsu, et al. Loss Minimization and Parameter Estimation with Heavy Tails, 2013, J. Mach. Learn. Res.