Data-independent Random Projections from the feature-space of the homogeneous polynomial kernel

Abstract: Performing a Random Projection from the feature space associated with a kernel function is valuable for two main reasons: (1) as a consequence of the Johnson–Lindenstrauss lemma, the resulting low-dimensional representation preserves most of the structure of the data in the kernel feature space, and (2) an efficient linear classifier trained on the transformed data can approach the accuracy of its nonlinear counterparts. In this paper, we present a novel method to perform Random Projections from the feature space of homogeneous polynomial kernels. As opposed to other kernelized Random Projection proposals, our method focuses on a specific kernel family in order to preserve some of the beneficial properties of the original Random Projection algorithm (e.g., data independence and efficiency). Our extensive experimental results show that the proposed method efficiently approximates a Random Projection from the kernel feature space, preserving pairwise distances and enabling a boost in linear classification accuracy.
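To make the idea concrete, the sketch below shows one standard data-independent construction for this setting; it is an illustrative assumption, not necessarily the paper's exact method, and the function name and parameters are introduced here for illustration only. For the degree-p homogeneous polynomial kernel k(x, y) = (x^T y)^p, the feature map is the p-fold tensor power phi(x) = x^{(x)p}, and projecting phi(x) onto a rank-one Gaussian tensor r_1 (x) ... (x) r_p collapses to the product of p ordinary linear projections, prod_j (r_j^T x), so the d^p-dimensional feature vector is never materialized.

```python
import numpy as np

def poly_feature_space_rp(X, k, degree=2, seed=0):
    """Data-independent random projection from the feature space of the
    degree-`degree` homogeneous polynomial kernel k(x, y) = (x^T y)^degree.

    Illustrative sketch (name and scaling are our assumptions): each of the
    k output coordinates projects phi(x) = x^{(x)degree} onto a rank-one
    Gaussian tensor r_1 (x) ... (x) r_degree, which collapses to the product
    prod_j (r_j^T x) and costs O(degree * d) per coordinate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Z = np.ones((n, k))
    for _ in range(degree):
        # Independent N(0, 1) directions for each tensor factor.
        R = rng.standard_normal((d, k))
        Z *= X @ R  # elementwise product of the linear projections
    # The 1/sqrt(k) scaling makes <z(x), z(y)> an unbiased estimate
    # of the kernel value (x^T y)^degree.
    return Z / np.sqrt(k)

# Sanity check: projected inner products should approximate the kernel matrix.
X = np.random.default_rng(1).standard_normal((50, 10))
Z = poly_feature_space_rp(X, k=20000, degree=2)
K_exact = (X @ X.T) ** 2
K_approx = Z @ Z.T  # close to K_exact up to Monte Carlo error
```

Because each coordinate is a product of `degree` independent Gaussian projections, its distribution is heavier-tailed than that of an ordinary Random Projection, so a somewhat larger output dimension k is typically needed to reach the same distortion.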
