Text Categorization with Support Vector Machines: Learning with Many Relevant Features

This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task. Empirical results support the theoretical findings. SVMs achieve substantial improvements over the currently best performing methods and behave robustly over a variety of different learning tasks. Furthermore they are fully automatic, eliminating the need for manual parameter tuning.

[1]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[2]  Gerard Salton,et al.  The SMART Retrieval System—Experiments in Automatic Document Processing , 1971 .

[3]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..

[4]  Vijay V. Raghavan,et al.  A critical investigation of recall and precision as measures of retrieval system performance , 1989, TOIS.

[5]  Philip J. Hayes,et al.  CONSTRUE/TIS: A System for Content-Based Indexing of a Database of News Stories , 1990, IAAI.

[6]  Evon M. O. Abu-Taieh,et al.  Comparative study , 2003, BMJ : British Medical Journal.

[7]  G Salton,et al.  Developments in Automatic Text Retrieval , 1991, Science.

[8]  David L. Waltz,et al.  Classifying news stories using memory based reasoning , 1992, SIGIR '92.

[9]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[10]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[11]  Alberto Maria Segre,et al.  Programs for Machine Learning , 1994 .

[12]  Yoav Shoham,et al.  Learning Information Retrieval Agents: Experiments with Automated Web Browsing , 1995 .

[13]  Vladimir Vapnik,et al.  The Nature of Statistical Learning , 1995 .

[14]  金田 重郎,et al.  C4.5: Programs for Machine Learning (書評) , 1995 .

[15]  Hinrich Schütze,et al.  A comparison of classifiers and document representations for the routing problem , 1995, SIGIR '95.

[16]  Andreas S. Weigend,et al.  A neural network approach to topic spotting , 1995 .

[17]  Manfred K. Warmuth,et al.  The perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant , 1995, COLT '95.

[18]  Bernhard Schölkopf,et al.  Improving the Accuracy and Speed of Support Vector Machines , 1996, NIPS.

[19]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[20]  Yiming Yang,et al.  Using Corpus Statistics to Remove Redundant Words in Text Categorization , 1996, J. Am. Soc. Inf. Sci..

[21]  Thorsten Joachims,et al.  A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization , 1997, ICML.

[22]  T. Joachims WebWatcher : A Tour Guide for the World Wide Web , 1997 .

[23]  Manfred K. Warmuth,et al.  The Perceptron Algorithm Versus Winnow: Linear Versus Logarithmic Mistake Bounds when Few Input Variables are Relevant (Technical Note) , 1997, Artif. Intell..

[24]  Federico Girosi,et al.  An improved training algorithm for support vector machines , 1997, Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop.

[25]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[26]  John Shawe-Taylor,et al.  Structural Risk Minimization Over Data-Dependent Hierarchies , 1998, IEEE Trans. Inf. Theory.

[27]  Thorsten Joachims,et al.  Text categorization with support vector machines , 1999 .

[28]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.