Improved Bounds on the Dot Product under Random Projection and Random Sign Projection

The dot product is a key building block in many data mining algorithms, from classification, regression, and correlation clustering to information retrieval and beyond. When the data are high dimensional, random projection serves as a universal dimensionality reduction method that provides both low-distortion guarantees and computational savings. Yet, in contrast to the optimal guarantees known for the preservation of the Euclidean distance (cf. the Johnson-Lindenstrauss lemma), the existing guarantees on the dot product under random projection in the data mining and machine learning literature are loose and incomplete. Some recent work has even suggested that the dot product may not be preserved when the angle between the original vectors is obtuse. In this paper we provide improved bounds on the dot product under random projection that match the optimal bounds on the Euclidean distance. As a corollary, we elucidate the impact of the angle between the original vectors on the relative distortion of the dot product under random projection, and we show that obtuse and acute angles behave symmetrically. In a further corollary we make a link to sign random projection, where we generalise earlier results. Numerical simulations confirm our theoretical results. Finally, we apply our results to bounding the generalisation error of compressive linear classifiers under the margin loss.
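For intuition on the quantities discussed above, the following minimal sketch (Python with NumPy) compares the dot product of two vectors with its estimate under a Gaussian random projection, and estimates the angle between the vectors via sign random projection. The dimensions d and k, the number of sign bits m, and the Gaussian test vectors are illustrative assumptions for the simulation, not the paper's exact experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, trials = 1000, 100, 2000     # hypothetical original/projected dimensions and repetitions

u = rng.standard_normal(d)
v = rng.standard_normal(d)
true_dot = u @ v

# Dot product under Gaussian random projection: R has i.i.d. N(0, 1/k) entries,
# so E[(Ru) . (Rv)] = u . v and the estimate concentrates as k grows.
estimates = np.empty(trials)
for t in range(trials):
    R = rng.standard_normal((k, d)) / np.sqrt(k)
    estimates[t] = (R @ u) @ (R @ v)

print(f"true dot product        : {true_dot:.3f}")
print(f"mean projected estimate : {estimates.mean():.3f}")
print(f"std of projected estimate: {estimates.std():.3f}")

# Sign random projection: each bit is sign(r . x) for a random Gaussian
# direction r, and P[bits disagree] = angle(u, v) / pi.
m = 4000                            # number of sign bits (illustrative)
R_sign = rng.standard_normal((m, d))
bits_u = np.sign(R_sign @ u)
bits_v = np.sign(R_sign @ v)
angle_est = np.mean(bits_u != bits_v) * np.pi
true_angle = np.arccos(true_dot / (np.linalg.norm(u) * np.linalg.norm(v)))

print(f"true angle              : {true_angle:.3f} rad")
print(f"sign-RP angle estimate  : {angle_est:.3f} rad")
```

With i.i.d. N(0, 1/k) entries the projected dot product is an unbiased estimate of the original one, and the fraction of disagreeing sign bits concentrates around the angle divided by pi; these are the kinds of quantities that the numerical simulations mentioned above probe.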
