Invariant kernel functions for pattern analysis and machine learning

Abstract. In many learning problems, prior knowledge about pattern variations can be formalized and beneficially incorporated into the analysis system. The corresponding notion of invariance is commonly used in conceptually different ways. We propose a more differentiated treatment, in particular for the active field of kernel methods for machine learning and pattern analysis. We also clarify the fundamental relation between invariant kernels and traditional invariant pattern analysis by means of invariant representations. After addressing these conceptual questions, we turn to practical aspects and present two generic approaches for constructing invariant kernels. The first is based on a technique called invariant integration; the second builds on invariant distances. In principle, both approaches support general transformations, in particular discrete transformations, transformations that do not form a group, and even infinitely many pattern transformations. Both also enable a smooth interpolation between invariant and non-invariant pattern analysis, so they form a general framework that covers both extremes. The wide applicability and various possible benefits of invariant kernels are demonstrated in different kernel methods.
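
The two constructions named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the transformation set is the finite set of cyclic shifts of a 1-D pattern, the base kernel is Gaussian, and the names `shifts`, `integration_kernel`, `invariant_distance`, `distance_kernel`, and the parameter `max_shift` are hypothetical. The integration variant averages the base kernel over all pairs of transformed patterns; the distance variant substitutes a minimum distance over transformed patterns into a Gaussian. Restricting `max_shift` is one simple way to mimic the interpolation between non-invariant and invariant analysis, not necessarily the mechanism used by the authors.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """Ordinary (non-invariant) Gaussian base kernel."""
    d = x - y
    return float(np.exp(-gamma * np.dot(d, d)))

def shifts(x, max_shift=None):
    """Cyclic shifts of x by 0..max_shift positions (hypothetical helper).

    max_shift=None uses all len(x) shifts, i.e. the full cyclic group.
    max_shift=0 returns only x itself (no transformation knowledge).
    """
    k = len(x) - 1 if max_shift is None else max_shift
    return [np.roll(x, s) for s in range(k + 1)]

def integration_kernel(x, y, max_shift=None, gamma=0.5):
    """Average the base kernel over all pairs of transformed patterns."""
    tx, ty = shifts(x, max_shift), shifts(y, max_shift)
    total = sum(rbf(a, b, gamma) for a in tx for b in ty)
    return total / (len(tx) * len(ty))

def invariant_distance(x, y, max_shift=None):
    """Smallest Euclidean distance between the two sets of transformed patterns."""
    tx, ty = shifts(x, max_shift), shifts(y, max_shift)
    return min(float(np.linalg.norm(a - b)) for a in tx for b in ty)

def distance_kernel(x, y, max_shift=None, gamma=0.5):
    """Gaussian kernel with the invariant distance substituted for the norm.

    Note: a kernel built this way is not guaranteed to be positive definite.
    """
    d = invariant_distance(x, y, max_shift)
    return float(np.exp(-gamma * d * d))

if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([0.0, 1.0, 0.0, 2.0])
    x_shifted = np.roll(x, 1)
    print(rbf(x, y), rbf(x_shifted, y))
    print(integration_kernel(x, y), integration_kernel(x_shifted, y))
    print(distance_kernel(x, x_shifted))
```

With the full shift set, both kernels give the same value for a pattern and any cyclically shifted copy of it, whereas the base kernel generally does not. With `max_shift=0` the integration kernel reduces to the base kernel, and intermediate values of `max_shift` sit between the two extremes.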
