Paper information - Faster Ridge Regression via the Subsampled Randomized Hadamard Transform

We propose a fast algorithm for ridge regression when the number of features is much larger than the number of observations (p ≫ n). The standard way to solve ridge regression in this setting works in the dual space and gives a running time of O(n²p). Our algorithm, Subsampled Randomized Hadamard Transform-Dual Ridge Regression (SRHT-DRR), runs in time O(np log(n)). It works by preconditioning the design matrix with a Randomized Walsh-Hadamard Transform and then subsampling the features. We provide risk bounds for our SRHT-DRR algorithm in the fixed design setting and show experimental results on synthetic and real datasets.
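
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`fwht_rows`, `dual_ridge`, `srht_dual_ridge`), the parameter `p_subs` (number of features kept), the regularization scaling `lam * I`, and the requirement that p be a power of 2 are all assumptions made for illustration. The plain Walsh-Hadamard transform used here costs O(np log p), so the sketch does not attempt to match the O(np log(n)) bound stated above.

```python
import numpy as np


def fwht_rows(A):
    """Unnormalized fast Walsh-Hadamard transform of each row of A.

    Assumes the number of columns is a power of 2.
    """
    A = np.array(A, dtype=float)  # work on a copy
    n_rows, p = A.shape
    assert p > 0 and p & (p - 1) == 0, "number of columns must be a power of 2"
    h = 1
    while h < p:
        # Split every block of 2h entries into two halves of length h.
        B = A.reshape(n_rows, p // (2 * h), 2, h)
        top = B[:, :, 0, :] + B[:, :, 1, :]
        bot = B[:, :, 0, :] - B[:, :, 1, :]
        A = np.stack([top, bot], axis=2).reshape(n_rows, p)
        h *= 2
    return A


def dual_ridge(X, y, lam):
    """Exact dual ridge regression: beta = X^T (X X^T + lam I)^{-1} y."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha


def srht_dual_ridge(X, y, lam, p_subs, rng):
    """Dual ridge regression with an SRHT-approximated kernel (illustrative)."""
    n, p = X.shape
    # Random sign flips followed by an orthonormal Walsh-Hadamard transform.
    signs = rng.choice([-1.0, 1.0], size=p)
    XH = fwht_rows(X * signs) / np.sqrt(p)
    # Uniformly subsample p_subs transformed features and rescale.
    cols = rng.choice(p, size=p_subs, replace=False)
    X_sub = XH[:, cols] * np.sqrt(p / p_subs)
    # Solve the n x n dual system with the approximate kernel.
    K_approx = X_sub @ X_sub.T
    alpha = np.linalg.solve(K_approx + lam * np.eye(n), y)
    # Map the dual solution back with the original design matrix.
    return X.T @ alpha
```

A possible usage, with synthetic data chosen only for illustration: `srht_dual_ridge(X, y, 0.5, 256, np.random.default_rng(0))` for an `X` of shape (20, 1024). When `p_subs` equals p the transform is orthogonal and nothing is dropped, so the result coincides with `dual_ridge(X, y, lam)`; smaller values of `p_subs` trade accuracy for a cheaper kernel computation.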
