Paper information - A study with multi-word feature with text classification

A study with multi-word feature with text classification

We carried out a series of experiments on text classification using multi-word features. A hand-crafted method was proposed to extract multi-words from a text data set, and two strategies were developed to normalize the extracted multi-words, yielding two versions of multi-word features. The texts were represented with each of the two feature versions in turn, and the resulting classification performance was compared to examine the effectiveness of the two strategies. The linear kernel and the nonlinear polynomial kernel of the support vector machine (SVM) were also compared on the text classification task.
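To make the setup concrete, the following is a minimal sketch of representing texts with multi-word features and evaluating a linear and a polynomial kernel on them. It is not the authors' method: the abstract does not describe the hand-crafted extraction or the two normalization strategies, so a simple document-frequency threshold over word n-grams is assumed here as a stand-in. All function names, the threshold, and the kernel parameters (`degree`, `coef0`) are hypothetical choices for illustration.

```python
from collections import Counter


def ngrams(tokens, n_min=2, n_max=3):
    """Yield contiguous word sequences of length n_min..n_max as tuples."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def extract_multiwords(docs, min_docs=2):
    """Keep n-grams found in at least min_docs documents (assumed rule)."""
    doc_freq = Counter()
    for doc in docs:
        # Use a set so each n-gram counts once per document.
        doc_freq.update(set(ngrams(doc.lower().split())))
    return sorted(g for g, c in doc_freq.items() if c >= min_docs)


def vectorize(doc, vocab):
    """Count how often each multi-word in vocab occurs in doc."""
    counts = Counter(ngrams(doc.lower().split()))
    return [counts[g] for g in vocab]


def linear_kernel(x, y):
    """Plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree."""
    return (linear_kernel(x, y) + coef0) ** degree


docs = [
    "support vector machine for text classification",
    "text classification with support vector machine",
    "decision trees for text mining",
]
vocab = extract_multiwords(docs)
vectors = [vectorize(d, vocab) for d in docs]

print(vocab)
print(linear_kernel(vectors[0], vectors[1]))
print(polynomial_kernel(vectors[0], vectors[1]))
```

In an SVM, these kernel values would play the role of similarities between documents. The polynomial kernel additionally weights combinations of co-occurring multi-words, which is one way a nonlinear kernel can behave differently from the linear one on the same features.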

Zhang Wen | Xijin Tang | Taketoshi Yoshida

[1] Sholom M. Weiss, et al. Automated learning of decision rules for text categorization, 1994, TOIS.

[2] Tong Zhang, et al. Text Mining: Predictive Methods for Analyzing Unstructured Information, 2004.

[3] Yiming Yang, et al. A re-examination of text categorization methods, 1999, SIGIR '99.

[4] Yiming Yang, et al. A Comparative Study on Feature Selection in Text Categorization, 1997, ICML.

[5] Vladimir Cherkassky, et al. Vapnik-Chervonenkis (VC) learning theory and its applications, 1999.

[6] Underhill Moore, et al. Learning theory and its application, 1943.

[7] M S Waterman, et al. Identification of common molecular subsequences, 1981, Journal of molecular biology.

[8] Steffen Staab, et al. Ontologies improve text document clustering, 2003, Third IEEE International Conference on Data Mining.

[9] J. Ross Quinlan, et al. Induction of Decision Trees, 1986, Machine Learning.