Improved Text Feature Selection Method Based on Text Feature Weight

This paper compares several feature selection methods for text categorization and proposes a new feature selection method (TFIDF_Ci) based on a weighted frequency that captures the distinction between texts. The method improves the TFIDF function through this frequency weighting, so that the selected feature items increase the ability to categorize documents. In the experiments, we tested this method against other feature selection methods using KNN classifiers. The results show that the new method performs well and remains stable across training sets of different sizes.
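
For orientation, the baseline that TFIDF_Ci modifies can be sketched as follows. This is a minimal illustration of plain TF-IDF weighting with a cosine-similarity KNN vote, not the TFIDF_Ci weighting itself, whose formula is not given in this abstract. The function names (`idf_table`, `tfidf`, `cosine`, `knn_predict`) and the toy corpus are assumptions made for the example only.

```python
import math
from collections import Counter


def idf_table(docs):
    """Inverse document frequency log(N / df) for every term in the training docs."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf(doc, idf):
    """Plain TF-IDF vector for one tokenized document; unseen terms are dropped."""
    tf = Counter(doc)
    total = len(doc) or 1  # guard against an empty document
    return {t: (c / total) * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_predict(train_vecs, train_labels, query_vec, k=3):
    """Majority vote among the k training documents most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(pair[0], query_vec),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy corpus (hypothetical data, for illustration only).
train_docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "stock"],
]
train_labels = ["sport", "sport", "finance", "finance"]

idf = idf_table(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]
prediction = knn_predict(train_vecs, train_labels, tfidf(["goal", "team", "win"], idf))
print(prediction)
```

A class-aware variant such as TFIDF_Ci would replace the weight computed in `tfidf` while leaving the KNN evaluation unchanged, which is what makes a comparison between weighting schemes under a fixed classifier possible.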