Paper Information - Virtual Examples for Text Classification with Support Vector Machines

Virtual Examples for Text Classification with Support Vector Machines

We explore how virtual examples (artificially created examples) improve the performance of text classification with Support Vector Machines (SVMs). We propose techniques for creating virtual examples for text classification, based on the assumption that the category of a document is unchanged even if a small number of words are added or deleted. We evaluate the proposed methods on the Reuters-21578 test collection. Experimental results show that virtual examples improve the performance of text classification with SVMs, especially for small training sets.
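The assumption stated in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the `rate` parameter, the per-word deletion probability, and the choice to draw added words from other documents with the same label are all assumptions made for illustration only.

```python
import random

def virtual_by_deletion(words, rate, rng):
    """Copy a document, dropping each word with probability `rate` (assumed scheme)."""
    kept = [w for w in words if rng.random() >= rate]
    return kept if kept else list(words)  # never return an empty document

def virtual_by_addition(words, same_label_docs, rate, rng):
    """Copy a document and append a few words drawn from same-label documents (assumed scheme)."""
    pool = [w for doc in same_label_docs for w in doc]
    n_add = max(1, round(rate * len(words)))
    extra = [rng.choice(pool) for _ in range(n_add)] if pool else []
    return list(words) + extra

def augment(dataset, rate=0.05, seed=0):
    """Return the original (words, label) pairs followed by two virtual examples per document."""
    rng = random.Random(seed)
    out = list(dataset)
    for i, (words, label) in enumerate(dataset):
        # Other documents that share this document's label.
        others = [w for j, (w, l) in enumerate(dataset) if l == label and j != i]
        # Each virtual example keeps the label of the document it was derived from.
        out.append((virtual_by_deletion(words, rate, rng), label))
        out.append((virtual_by_addition(words, others, rate, rng), label))
    return out

if __name__ == "__main__":
    data = [
        (["wheat", "grain", "export"], "grain"),
        (["corn", "grain", "price"], "grain"),
        (["oil", "crude", "barrel"], "crude"),
    ]
    for words, label in augment(data, rate=0.3, seed=1):
        print(label, words)
```

The augmented list would then be vectorized and passed to an SVM trainer in place of the original training set. Keeping `rate` small matches the "small number of words" condition in the abstract.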

Manabu Sassano
