How to Fake Multiply by a Gaussian Matrix

Have you ever wanted to multiply an $n \times d$ matrix $X$, with $n \gg d$, on the left by an $m \times n$ matrix $\tilde G$ of i.i.d. Gaussian random variables, but could not afford to do it because it was too slow? In this work we propose a new randomized $m \times n$ matrix $T$, for which one can compute $T \cdot X$ in only $O(\text{nnz}(X)) + \tilde O(m^{1.5} \cdot d^{3})$ time, for which the total variation distance between the distributions $T \cdot X$ and $\tilde G \cdot X$ is as small as desired, i.e., less than any positive constant. Here $\text{nnz}(X)$ denotes the number of non-zero entries of $X$. Assuming $\text{nnz}(X) \gg m^{1.5} \cdot d^{3}$, this is a significant savings over the naive $O(\text{nnz}(X) m)$ time to compute $\tilde G \cdot X$. Moreover, since the total variation distance is small, we can provably use $T \cdot X$ in place of $\tilde G \cdot X$ in any application and have the same guarantees as if we were using $\tilde G \cdot X$, up to a small positive constant in error probability. We apply this transform to nonnegative matrix factorization (NMF) and support vector machines (SVM).

[1]  Nicolas Gillis,et al.  Using underapproximations for sparse nonnegative matrix factorization , 2009, Pattern Recognit..

[2]  David P. Woodruff Sketching as a Tool for Numerical Linear Algebra , 2014, Found. Trends Theor. Comput. Sci..

[3]  Don Coppersmith,et al.  Matrix multiplication via arithmetic progressions , 1987, STOC.

[4]  Ali Taylan Cemgil,et al.  Generalised Coupled Tensor Factorisation , 2011, NIPS.

[5]  Vikas Sindhwani,et al.  Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization , 2012, ICML.

[6]  Christos Boutsidis,et al.  Random Projections for Linear Support Vector Machines , 2012, TKDD.

[7]  Joel A. Tropp,et al.  Factoring nonnegative matrices with linear programs , 2012, NIPS.

[8]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[9]  David P. Woodruff,et al.  Low rank approximation and regression in input sparsity time , 2013, STOC '13.

[10]  David F. Gleich,et al.  Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices , 2014, NIPS.

[11]  Joel A. Tropp,et al.  Improved Analysis of the subsampled Randomized Hadamard Transform , 2010, Adv. Data Sci. Adapt. Anal..

[12]  References , 1971 .

[13]  P. Paatero,et al.  Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values† , 1994 .

[14]  Victoria Stodden,et al.  When Does Non-Negative Matrix Factorization Give a Correct Decomposition into Parts? , 2003, NIPS.

[15]  Nicolas Gillis,et al.  Fast and Robust Recursive Algorithmsfor Separable Nonnegative Matrix Factorization , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[17]  G. Brumfiel High-energy physics: Down the petabyte highway , 2011, Nature.

[18]  Stephen A. Vavasis,et al.  On the Complexity of Nonnegative Matrix Factorization , 2007, SIAM J. Optim..

[19]  Sivan Toledo,et al.  Blendenpik: Supercharging LAPACK's Least-Squares Solver , 2010, SIAM J. Sci. Comput..

[20]  Guillermo Sapiro,et al.  Compressed Nonnegative Matrix Factorization Is Fast and Accurate , 2015, IEEE Transactions on Signal Processing.

[21]  David P. Woodruff,et al.  Low rank approximation and regression in input sparsity time , 2012, STOC '13.

[22]  M. C. U. Araújo,et al.  The successive projections algorithm for variable selection in spectroscopic multicomponent analysis , 2001 .

[23]  Sanjeev Arora,et al.  Computing a nonnegative matrix factorization -- provably , 2011, STOC '12.

[24]  Alexander J. Smola,et al.  Fastfood: Approximate Kernel Expansions in Loglinear Time , 2014, ArXiv.

[25]  Virginia Vassilevska Williams,et al.  Multiplying matrices faster than coppersmith-winograd , 2012, STOC '12.