Kernels for Text Analysis

During the past decade, kernel methods have proved successful in a range of text analysis tasks. Several properties make kernel-based methods applicable to many real-world problems, especially in domains where data is not naturally represented in vector form. First, instead of requiring manual construction of the feature space for the learning task, kernel functions provide an alternative way to design useful features automatically, thereby allowing very rich representations. Second, kernels can be designed to incorporate prior knowledge about the domain. This property makes it possible to notably improve the performance of general learning methods and to adapt them simply to a specific problem. Finally, kernel methods are naturally applicable in situations where the data representation is not in vectorial form, thus avoiding an extensive preprocessing step. In this chapter, we present the main ideas behind kernel methods in general and kernels for text analysis in particular, and we provide an example of designing a feature space for the parse ranking problem with different kernel functions.
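As a rough illustration of how a kernel can operate on text that is not in vector form, the sketch below computes a simple k-spectrum string kernel: the inner product of substring-count features, evaluated directly on two raw strings. This is a generic example and not the parse ranking kernels developed in the chapter; the function names `spectrum_features` and `spectrum_kernel` and the default `k=3` are assumptions made for illustration.

```python
from collections import Counter


def spectrum_features(text, k=3):
    """Count every contiguous substring of length k in text (hypothetical helper)."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def spectrum_kernel(s, t, k=3):
    """Inner product of the k-mer count features of s and t.

    The feature space (one dimension per possible k-mer) is never
    materialized; only substrings that actually occur are touched.
    """
    fs, ft = spectrum_features(s, k), spectrum_features(t, k)
    # Iterate over the smaller feature map; missing k-mers count as 0.
    if len(fs) > len(ft):
        fs, ft = ft, fs
    return sum(count * ft[kmer] for kmer, count in fs.items())


if __name__ == "__main__":
    print(spectrum_kernel("kernel methods", "kernel machines"))
```

The kernel value grows with the number of shared substrings, so two sentences can be compared without any explicit feature engineering. A matrix of such values could then be passed to any learner that accepts a precomputed kernel.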
