Paper information - Using clustering to enhance text classification

Using clustering to enhance text classification

This paper addresses the problem of learning to classify texts by exploiting information derived from clustering both the training and testing sets. Incorporating the knowledge gained from clustering into the feature-space representation of the texts is expected to boost the performance of a classifier. Experiments conducted on several widely used datasets demonstrate the effectiveness of the proposed algorithm, especially for small training sets.
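The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the clustering knowledge is encoded, so the sketch assumes one extra feature per cluster (a one-hot membership indicator). It also swaps in a plain k-means with farthest-first initialisation and a nearest-centroid classifier as simple stand-ins. The function names (`kmeans_labels`, `augment`, `fit_centroids`, `predict`), the `weight` parameter, and the toy term-count vectors are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=10):
    """Plain k-means; returns one cluster index per point."""
    # Deterministic farthest-first initialisation (an assumption made
    # here for reproducibility, not something stated in the abstract).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def augment(train_x, test_x, k, weight=1.0):
    """Cluster training and testing vectors together, then append one
    indicator feature per cluster to every vector (assumed encoding)."""
    all_x = train_x + test_x
    labels = kmeans_labels(all_x, k)
    aug = [list(x) + [weight if lab == c else 0.0 for c in range(k)]
           for x, lab in zip(all_x, labels)]
    return aug[:len(train_x)], aug[len(train_x):]


def fit_centroids(xs, ys):
    """Nearest-centroid classifier (stand-in): one mean vector per class."""
    by_class = {}
    for x, y in zip(xs, ys):
        by_class.setdefault(y, []).append(x)
    return {y: [sum(col) / len(rows) for col in zip(*rows)]
            for y, rows in by_class.items()}


def predict(model, x):
    """Return the class whose mean vector is closest to x."""
    return min(model, key=lambda y: sq_dist(x, model[y]))


# Toy term-count vectors over a four-word vocabulary (illustrative only).
train_x = [[3, 2, 0, 0], [0, 0, 2, 3]]
train_y = ["sports", "finance"]
test_x = [[2, 3, 0, 1], [0, 1, 3, 2]]

train_aug, test_aug = augment(train_x, test_x, k=2)
model = fit_centroids(train_aug, train_y)
preds = [predict(model, x) for x in test_aug]
print(preds)  # ['sports', 'finance']
```

Because the clustering runs over the training and testing texts together, an unlabeled test document that lands in the same cluster as a labeled training document shares its cluster feature. That shared feature is the kind of extra signal that could matter most when labeled examples are scarce.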

Theodore Kalamboukis | Antonia Kyriakopoulou

[1] T. Kalamboukis, et al. Text Classification Using Clustering, 2006.

[2] Michael McGill, et al. Introduction to Modern Information Retrieval, 1983.

[3] George Karypis, et al. CLUTO - A Clustering Toolkit, 2002.

[4] Hongjun Lu, et al. CBC: clustering based text classification requiring minimal labeled data, 2003, Third IEEE International Conference on Data Mining.

[5] Adam Kowalczyk, et al. Using Unlabelled Data for Text Classification through Addition of Cluster Parameters, 2002, International Conference on Machine Learning.

[6] Thorsten Joachims, et al. Transductive Inference for Text Classification using Support Vector Machines, 1999, ICML.

[7] Ran El-Yaniv, et al. Distributional Word Clusters vs. Words for Text Categorization, 2003, J. Mach. Learn. Res.