HowNet-Based Conceptual Feature Selection Method

Feature selection is an important issue in text filtering. However, word features carry little semantic information, which is a major weakness of word-based document representation. This paper presents a novel semantic feature selection method, built on the vector space model, that uses HowNet as its semantic repository. Compared with simple word features, the method better represents the conceptual features of texts and improves system performance, while also reducing the dimension of the text vector, which lowers the computational load and improves filtering efficiency. Experimental results on our Chinese text filtering system, which integrates the method, demonstrate its effectiveness.
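As a rough illustration of why concept-level features can shrink the text vector, the sketch below maps words to concepts before counting them. It is a minimal example under stated assumptions, not the method evaluated in this paper: the table `WORD_TO_CONCEPT`, its concept labels, and the helper names are hypothetical stand-ins for a lookup against HowNet, and keeping unmapped words unchanged is likewise an assumption.

```python
from collections import Counter

# Hypothetical toy word-to-concept table. A real system would derive
# these mappings from HowNet; the labels here are illustrative only.
WORD_TO_CONCEPT = {
    "doctor": "medical",
    "nurse": "medical",
    "hospital": "medical",
    "bank": "finance",
    "loan": "finance",
}


def word_vector(tokens):
    """Bag-of-words vector: one dimension per distinct word."""
    return Counter(tokens)


def concept_vector(tokens, mapping=WORD_TO_CONCEPT):
    """Concept vector: words sharing a concept collapse into one dimension.

    Words missing from the mapping are kept as their own dimension
    (an assumption made for this sketch).
    """
    return Counter(mapping.get(token, token) for token in tokens)


tokens = ["doctor", "nurse", "hospital", "loan", "bank", "doctor"]
words = word_vector(tokens)
concepts = concept_vector(tokens)

# Compare the number of dimensions in each representation.
print(len(words), len(concepts))
```

In this toy input, five distinct words collapse into two concept dimensions, which is the kind of dimension reduction the abstract describes.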