Kernel Methods for Exploratory Pattern Analysis: A Demonstration on Text Data

Kernel methods are a class of algorithms for pattern analysis with several convenient features. They handle many data types in a uniform way and can be used to detect many types of relations in data. Importantly for applications, they have a modular structure: any kernel function can be used with any kernel-based algorithm. This means that customized solutions can be easily developed from a standard library of kernels and algorithms. This paper presents a case study in which many algorithms and kernels are mixed and matched for a cross-language text analysis task. All the software is available online.
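The modularity described above can be illustrated with a minimal sketch. It is not the software that accompanies the paper: the function names (linear_kernel, gaussian_kernel, kernel_pca, kernel_ridge_fit), the toy data, and the parameter values are all assumptions chosen for illustration. The point it shows is that each algorithm sees only a kernel (Gram) matrix, so either kernel can be paired with either algorithm.

```python
import numpy as np

# --- Kernels: each maps two sets of points to a Gram matrix ---------------
def linear_kernel(X, Y):
    return X @ Y.T

def gaussian_kernel(X, Y, sigma=1.0):
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma ** 2))

# --- Algorithms: each consumes only a Gram matrix K ------------------------
def kernel_pca(K, n_components=2):
    """Project the training points onto the top principal directions in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # Gram matrix of centered features
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def kernel_ridge_fit(K, y, lam=1e-3):
    """Dual coefficients alpha solving (K + lam * I) alpha = y."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

# --- Mix and match on toy data (hypothetical, for illustration only) -------
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.sign(X[:, 0])

results = {}
for name, kernel in [("linear", linear_kernel), ("gaussian", gaussian_kernel)]:
    K = kernel(X, X)
    Z = kernel_pca(K)                 # unsupervised: 2-D embedding
    alpha = kernel_ridge_fit(K, y)    # supervised: dual regression weights
    results[name] = (Z, alpha)
```

Swapping in a kernel defined on strings or graphs would leave kernel_pca and kernel_ridge_fit unchanged, provided the kernel returns a symmetric positive semidefinite Gram matrix.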
