Latent Semantic Kernels

Kernel methods like support vector machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representation of two documents, in analogy with classical information retrieval (IR) approaches. Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost. In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space. We provide experimental results demonstrating that the approach can significantly improve performance, and that it does not impair it.
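
As a rough illustration of what implementing LSI in a kernel-defined feature space can look like, the sketch below projects documents onto the top k eigenvectors of the training kernel matrix, using kernel values only. It is a minimal sketch under stated assumptions, not necessarily the algorithm used in the paper: the function name latent_semantic_kernel, its parameters, and the toy data are hypothetical, the eigendecomposition is applied without centering, and the k retained eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def latent_semantic_kernel(K_train, K_a, K_b, k):
    """Kernel between two document sets after projection onto k latent directions.

    K_train: (n, n) kernel matrix of the n training documents.
    K_a:     (p, n) kernel values between p documents and the training documents.
    K_b:     (q, n) kernel values between q documents and the training documents.
    All names are illustrative assumptions, not taken from the paper.
    """
    # Eigendecomposition of the symmetric kernel matrix (eigenvalues ascending).
    eigvals, eigvecs = np.linalg.eigh(K_train)
    # Keep the k largest eigenvalues and their eigenvectors.
    top = np.argsort(eigvals)[::-1][:k]
    lam, V = eigvals[top], eigvecs[:, top]
    # Coordinates of each document in the k-dimensional latent space.
    P_a = (K_a @ V) / np.sqrt(lam)
    P_b = (K_b @ V) / np.sqrt(lam)
    # Inner products in the latent space give the new kernel values.
    return P_a @ P_b.T

# Toy data: 6 documents over 10 terms, with the plain inner product as base kernel.
rng = np.random.default_rng(0)
X = rng.random((6, 10))
K_train = X @ X.T
K_lsk = latent_semantic_kernel(K_train, K_train, K_train, k=3)
```

With the inner-product kernel used here, the result coincides with projecting the documents onto the top k left singular vectors of the term-by-document matrix, which is the usual LSI projection. The same function accepts any other positive semidefinite base kernel, in which case the term space is never formed explicitly.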
