A Selective Review on Statistical Techniques for Big Data
[1] W. B. Johnson,et al. Extensions of Lipschitz mappings into Hilbert space , 1984 .
[2] R. Tibshirani,et al. Least angle regression , 2004, math/0406456.
[3] R. Tibshirani,et al. REJOINDER TO "LEAST ANGLE REGRESSION" BY EFRON ET AL. , 2004, math/0406474.
[4] Bernard Chazelle,et al. Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform , 2006, STOC '06.
[5] S. Muthukrishnan,et al. Sampling algorithms for l2 regression and applications , 2006, SODA '06.
[6] Nir Ailon,et al. Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes , 2008, SODA '08.
[7] Bernard Chazelle,et al. The Fast Johnson--Lindenstrauss Transform and Approximate Nearest Neighbors , 2009, SIAM J. Comput..
[8] Sivan Toledo,et al. Blendenpik: Supercharging LAPACK's Least-Squares Solver , 2010, SIAM J. Sci. Comput..
[9] Haim Avron,et al. Blendenpik: Supercharging LAPACK's Least-Squares Solver , 2010 .
[10] Michael W. Mahoney. Randomized Algorithms for Matrices and Data , 2011, Found. Trends Mach. Learn..
[11] S. Muthukrishnan,et al. Faster least squares approximation , 2007, Numerische Mathematik.
[12] Ruibin Xi,et al. Aggregated estimating equation estimation , 2011 .
[13] David P. Woodruff,et al. Fast approximation of matrix coherence and statistical leverage , 2011, ICML.
[14] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[15] Xi Chen,et al. Variance Reduction for Stochastic Gradient Optimization , 2013, NIPS.
[16] F. Liang,et al. A Resampling-Based Stochastic Approximation Method for Analysis of Large Geostatistical Data , 2013 .
[17] Deanna Needell,et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm , 2013, Mathematical Programming.
[18] Han Liu,et al. Challenges of Big Data Analysis. , 2013, National science review.
[19] Min‐ge Xie,et al. A split-and-conquer approach for analysis of , 2014 .
[20] Trevor Hastie,et al. LOCAL CASE-CONTROL SAMPLING: EFFICIENT SUBSAMPLING IN IMBALANCED DATA SETS. , 2013, Annals of statistics.
[21] Minge Xie,et al. A Split-and-Conquer Approach for Analysis of Extraordinarily Large Data , 2014 .
[22] E. Airoldi,et al. Asymptotic and finite-sample properties of estimators based on stochastic gradients , 2014 .
[23] Martin J. Wainwright,et al. Divide and conquer kernel ridge regression: a distributed algorithm with minimax optimal rates , 2013, J. Mach. Learn. Res..
[24] Tong Zhang,et al. Stochastic Optimization with Importance Sampling for Regularized Loss Minimization , 2014, ICML.
[25] Ping Ma,et al. A statistical perspective on algorithmic leveraging , 2013, J. Mach. Learn. Res..
[26] Matthias Katzfuss,et al. A Multi-Resolution Approximation for Massive Spatial Datasets , 2015, 1507.04789.
[27] Jelena Kovacevic,et al. A statistical perspective of sampling scores for linear regression , 2015, 2016 IEEE International Symposium on Information Theory (ISIT).
[28] Jing Wu,et al. Online Updating of Statistical Inference in the Big Data Setting , 2015, Technometrics.
[29] Aarti Singh,et al. On Computationally Tractable Selection of Experiments in Measurement-Constrained Regression Models , 2016, J. Mach. Learn. Res..
[30] Guang Cheng,et al. Computational Limits of A Distributed Algorithm for Smoothing Spline , 2015, J. Mach. Learn. Res..
[31] Jianqing Fan,et al. DISTRIBUTED TESTING AND ESTIMATION UNDER SPARSE HIGH DIMENSIONAL MODELS. , 2018, Annals of statistics.
[32] Min Yang,et al. Information-Based Optimal Subdata Selection for Big Data Linear Regression , 2017, Journal of the American Statistical Association.
[33] Ming-Hui Chen,et al. Online updating method with new variables for big data streams , 2018, The Canadian journal of statistics = Revue canadienne de statistique.
[34] HaiYing Wang,et al. Optimal subsampling for softmax regression , 2019, Statistical Papers.
[35] Rong Zhu,et al. Optimal Subsampling for Large Sample Logistic Regression , 2017, Journal of the American Statistical Association.
[36] Yan Wang,et al. A fast divide-and-conquer sparse Cox regression. , 2018, Biostatistics.
[37] Peter X.-K. Song,et al. Renewable estimation and incremental inference in generalized linear models with streaming data sets , 2019, Journal of the Royal Statistical Society: Series B (Statistical Methodology).
[38] HaiYing Wang,et al. More Efficient Estimation for Logistic Regression with Optimal Subsamples , 2018, J. Mach. Learn. Res..
[39] Wenxuan Zhong,et al. Online Decentralized Leverage Score Sampling for Streaming Multidimensional Time Series , 2019, AISTATS.
[40] HaiYing Wang,et al. Divide-and-Conquer Information-Based Optimal Subdata Selection Algorithm , 2019, Journal of Statistical Theory and Practice.
[41] Tong Zhang,et al. Local Uncertainty Sampling for Large-Scale Multi-Class Logistic Regression , 2016, The Annals of Statistics.
[42] Yanyuan Ma,et al. Optimal subsampling for quantile regression in big data , 2020, Biometrika.
[43] Mingyao Ai,et al. Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators With Massive Data , 2020, Journal of the American Statistical Association.
[44] J. Tropp,et al. Randomized numerical linear algebra: Foundations and algorithms , 2020, Acta Numerica.
[45] Michael W. Mahoney,et al. Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms , 2020, AISTATS.
[46] Jun Yu,et al. OPTIMAL SUBSAMPLING ALGORITHMS FOR BIG DATA REGRESSIONS , 2018, Statistica Sinica.