Learning with Algebraic Invariances, and the Invariant Kernel Trick

When solving data analysis problems, it is important to integrate prior knowledge and/or structural invariances. This paper contributes a novel framework for incorporating algebraic invariance structure into kernels. In particular, we show that algebraic properties such as sign symmetries in data, phase independence, and scaling can be included easily by essentially performing the kernel trick twice. We demonstrate the usefulness of our theory in simulations on selected applications such as sign-invariant spectral clustering and underdetermined ICA.
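As a rough illustration of what "performing the kernel trick twice" could look like for sign symmetry, the sketch below stacks a Gaussian kernel on top of the homogeneous quadratic kernel <x, y>^2, which does not change when x or y is replaced by its negative. This is one plausible reading of the abstract, not the paper's stated construction; the function name `sign_invariant_gaussian_kernel` and the bandwidth parameter `sigma` are hypothetical.

```python
import numpy as np

def sign_invariant_gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel applied in the feature space of k1(x, y) = <x, y>^2.

    Assumption: this is an illustrative construction, not taken from the paper.
    X has shape (n, d), Y has shape (m, d); the result has shape (n, m).
    """
    # First kernel trick: k1(x, y) = <x, y>^2 is invariant under x -> -x.
    k_xy = (X @ Y.T) ** 2
    k_xx = np.sum(X * X, axis=1) ** 2
    k_yy = np.sum(Y * Y, axis=1) ** 2
    # Squared distance in the feature space of k1, computed from k1 alone.
    d2 = k_xx[:, None] - 2.0 * k_xy + k_yy[None, :]
    d2 = np.maximum(d2, 0.0)  # guard against small negative rounding errors
    # Second kernel trick: Gaussian kernel on that feature-space distance.
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

Because the resulting matrix depends on each point only up to sign, it could be passed as a precomputed affinity to a clustering routine, so that x and -x are treated as the same point.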
