Sparse sketches with small inversion bias
Michał Dereziński | Zhenyu Liao | Edgar Dobriban | Michael W. Mahoney
[1] Tengyao Wang, et al. Sparse principal component analysis via axis-aligned random projections, 2017, Journal of the Royal Statistical Society: Series B (Statistical Methodology).
[2] Manfred K. Warmuth, et al. Leveraged volume sampling for linear regression, 2018, NeurIPS.
[3] Michael B. Cohen, et al. Nearly Tight Oblivious Subspace Embeddings by Trace Inequalities, 2016, SODA.
[4] Kurt Keutzer, et al. ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning, 2020, AAAI.
[5] J. Hartigan. Linear Bayesian Methods, 1969.
[6] Shusen Wang, et al. GIANT: Globally Improved Approximate Newton Method for Distributed Optimization, 2017, NeurIPS.
[7] David P. Woodruff, et al. Low rank approximation and regression in input sparsity time, 2012, STOC '13.
[8] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra, 2014.
[9] Dechang Chen, et al. The Theory of the Design of Experiments, 2001, Technometrics.
[10] Santosh S. Vempala, et al. The Random Projection Method, 2005, DIMACS Series in Discrete Mathematics and Theoretical Computer Science.
[11] Huy L. Nguyen, et al. OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[12] Andrea Montanari, et al. The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve, 2019, Communications on Pure and Applied Mathematics.
[13] Kenneth L. Clarkson, et al. Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression, 2019, COLT.
[14] R. Couillet, et al. Random Matrix Methods for Wireless Communications, 2011.
[15] Martin J. Wainwright, et al. Communication-efficient algorithms for statistical optimization, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[16] Richard A. Davis, et al. Time Series: Theory and Methods, 2013.
[17] Manfred K. Warmuth, et al. Unbiased estimators for random design regression, 2019, ArXiv.
[18] V. Rokhlin, et al. A fast randomized algorithm for the approximation of matrices, 2007.
[19] Edgar Dobriban, et al. Ridge Regression: Structure, Cross-Validation, and Sketching, 2020, ICLR.
[20] Volkan Cevher, et al. Practical Sketching Algorithms for Low-Rank Matrix Approximation, 2016, SIAM J. Matrix Anal. Appl.
[21] Jianqing Fan, et al. An Overview of the Estimation of Large Covariance and Precision Matrices, 2015, The Econometrics Journal.
[22] R. Paley, et al. A note on analytic functions in the unit circle, 1932, Mathematical Proceedings of the Cambridge Philosophical Society.
[23] Cameron Musco, et al. Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition, 2015, NIPS.
[24] S. Muthukrishnan, et al. Relative-Error CUR Matrix Decompositions, 2007, SIAM J. Matrix Anal. Appl.
[25] D. Burkholder. Distribution Function Inequalities for Martingales, 1973.
[26] David P. Woodruff, et al. How to Reduce Dimension With PCA and Random Projections?, 2020, IEEE Transactions on Information Theory.
[27] Calyampudi Radhakrishna Rao, et al. Linear Statistical Inference and its Applications, 1967.
[28] Michael W. Mahoney, et al. A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares, 2014, J. Mach. Learn. Res.
[29] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.
[30] Yang Liu, et al. Fast Relative-Error Approximation Algorithm for Ridge Regression, 2015, UAI.
[31] Per-Gunnar Martinsson, et al. Randomized algorithms for the low-rank approximation of matrices, 2007, Proceedings of the National Academy of Sciences.
[32] Gideon S. Mann, et al. Distributed Training Strategies for the Structured Perceptron, 2010, NAACL.
[33] V. Marčenko, et al. Distribution of Eigenvalues for Some Sets of Random Matrices, 1967.
[34] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra, 2014, Found. Trends Theor. Comput. Sci.
[35] T. Cai, et al. A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation, 2011, 1102.2233.
[36] Eric C. Chi, et al. Stable Estimation of a Covariance Matrix Guided by Nuclear Norm Penalties, 2013, Comput. Stat. Data Anal.
[37] Carl D. Meyer, et al. Matrix Analysis and Applied Linear Algebra, 2000.
[39] S. Muthukrishnan, et al. Sampling algorithms for l2 regression and applications, 2006, SODA '06.
[40] Michael W. Mahoney, et al. Distributed estimation of the inverse Hessian by determinantal averaging, 2019, NeurIPS.
[41] M. Rudelson, et al. Hanson-Wright inequality and sub-gaussian concentration, 2013.
[42] Manfred K. Warmuth, et al. Reverse iterative volume sampling for linear regression, 2018, J. Mach. Learn. Res.
[43] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.
[44] Martin J. Wainwright, et al. Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence, 2015, SIAM J. Optim.
[45] David P. Woodruff, et al. Fast approximation of matrix coherence and statistical leverage, 2011, ICML.
[46] Shusen Wang, et al. Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging, 2017, ICML.
[47] Olivier Ledoit, et al. Nonlinear Shrinkage Estimation of Large-Dimensional Covariance Matrices, 2011, 1207.5322.
[48] Peter Richtárik, et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence, 2016, ArXiv.
[49] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.
[50] Jianqing Fan, et al. Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation, 2007, Annals of Statistics.
[51] Edgar Dobriban, et al. Asymptotics for Sketching in Least Squares Regression, 2018, NeurIPS.
[52] C. Tracy, et al. Introduction to Random Matrices, 1992, hep-th/9210073.
[53] Daniele Calandriello, et al. Exact sampling of determinantal point processes with sublinear time preprocessing, 2019, NeurIPS.
[54] Jelani Nelson, et al. Lower Bounds for Oblivious Subspace Embeddings, 2013.
[55] Michael W. Mahoney, et al. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression, 2012, STOC '13.
[56] Daniele Calandriello, et al. Sampling from a k-DPP without looking at all items, 2020, NeurIPS.
[57] T. Tao. Topics in Random Matrix Theory, 2012.
[58] Tamás Sarlós, et al. Improved Approximation Algorithms for Large Matrices via Random Projections, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[59] J. L. Roux. An Introduction to the Kalman Filter, 2003.
[60] E. Dobriban, et al. Distributed linear regression by averaging, 2018, The Annals of Statistics.
[61] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[62] N. L. Johnson, et al. Linear Statistical Inference and Its Applications, 1966.
[63] Moses Charikar, et al. Finding frequent items in data streams, 2002, Theor. Comput. Sci.
[64] Jean-Philippe Bouchaud, et al. Cleaning large correlation matrices: tools from random matrix theory, 2016, 1610.08104.
[65] Michael W. Mahoney, et al. RandNLA, 2016, Commun. ACM.
[66] Michael W. Mahoney, et al. Determinantal Point Processes in Randomized Numerical Linear Algebra, 2020, Notices of the American Mathematical Society.
[67] Michal Derezinski, et al. Fast determinantal point processes via distortion-free intermediate sampling, 2018, COLT.
[68] Alfred O. Hero, et al. ℓ0 Sparse Inverse Covariance Estimation, 2014, IEEE Transactions on Signal Processing.
[69] Dean P. Foster, et al. Faster Ridge Regression via the Subsampled Randomized Hadamard Transform, 2013, NIPS.
[70] Alan M. Frieze, et al. Fast Monte-Carlo algorithms for finding low-rank approximations, 2004, JACM.
[71] Bernard Chazelle, et al. The Fast Johnson-Lindenstrauss Transform and Approximate Nearest Neighbors, 2009, SIAM J. Comput.
[72] M. Yuan, et al. Model selection and estimation in the Gaussian graphical model, 2007.
[73] Nathan Halko, et al. An Algorithm for the Principal Component Analysis of Large Data Sets, 2010, SIAM J. Sci. Comput.
[74] Martin J. Wainwright, et al. A More Powerful Two-Sample Test in High Dimensions using Random Projection, 2011, NIPS.
[75] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[76] Yuchen Zhang, et al. DiSCO: Distributed Optimization for Self-Concordant Empirical Loss, 2015, ICML.
[77] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.
[78] Heinz H. Bauschke, et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2011, CMS Books in Mathematics.
[79] Michael W. Mahoney, et al. Exact expressions for double descent and implicit regularization via surrogate random design, 2019, NeurIPS.
[80] Peter Richtárik, et al. Federated Learning: Strategies for Improving Communication Efficiency, 2016, ArXiv.
[81] Teodoro Collin. Random Matrix Theory, 2016.
[82] Michael Jackson, et al. Optimal Design of Experiments, 1994.
[83] N. Meinshausen, et al. High-dimensional graphs and variable selection with the Lasso, 2006, math/0608017.
[84] Michael W. Mahoney, et al. PyHessian: Neural Networks Through the Lens of the Hessian, 2019, 2020 IEEE International Conference on Big Data (Big Data).
[85] Parikshit Shah, et al. Sketching Sparse Matrices, Covariances, and Graphs via Tensor Products, 2015, IEEE Transactions on Information Theory.
[86] Petros Drineas, et al. Lectures on Randomized Numerical Linear Algebra, 2017, IAS/Park City Mathematics Series.
[87] David Ruppert, et al. RAPTT: An Exact Two-Sample Test in High Dimensions Using Random Projections, 2014, 1405.1792.
[88] Martin J. Wainwright, et al. Randomized sketches of convex programs with sharp guarantees, 2014, 2014 IEEE International Symposium on Information Theory.
[89] Petros Drineas, et al. Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, 2004.
[90] J. W. Silverstein, et al. Spectral Analysis of Large Dimensional Random Matrices, 2009.
[91] Yueqi Sheng, et al. One-shot distributed ridge regression in high dimensions, 2019, ICML.
[92] Bernard Chazelle, et al. Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform, 2006, STOC '06.
[93] Ping Ma, et al. A statistical perspective on algorithmic leveraging, 2013, J. Mach. Learn. Res.
[94] Gideon S. Mann, et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, 2009, NIPS.
[95] R. Samworth, et al. Random-projection ensemble classification, 2015, 1504.04595.
[96] R. Tibshirani, et al. Sparse inverse covariance estimation with the graphical lasso, 2008, Biostatistics.
[97] Mert Pilanci, et al. Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization, 2020, NeurIPS.
[98] Alexander J. Smola, et al. AIDE: Fast and Communication Efficient Distributed Optimization, 2016, ArXiv.
[99] V. Rokhlin, et al. A randomized algorithm for the approximation of matrices, 2006.
[100] Anja Vogler, et al. An Introduction to Multivariate Statistical Analysis, 2004.
[101] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[102] Marko Znidaric, et al. Asymptotic Expansion for Inverse Moments of Binomial and Poisson Distributions, 2005, math/0511226.
[103] S. Muthukrishnan, et al. Faster least squares approximation, 2007, Numerische Mathematik.
[104] Martin J. Wainwright, et al. Iterative Hessian Sketch: Fast and Accurate Solution Approximation for Constrained Least-Squares, 2014, J. Mach. Learn. Res.
[105] John C. Duchi, et al. Distributed delayed stochastic optimization, 2011, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[106] Michael W. Mahoney. Randomized Algorithms for Matrices and Data, 2011, Found. Trends Mach. Learn.