ANALYSIS AND CONSTRUCTION OF WORD WEIGHING FUNCTION IN VSM

Text classification is the basis and core of text mining, and plays an important role in traditional information retrieval, the construction of website architecture, and web information search. It has become an active research topic in recent years. In this paper, the essence of the VSM (vector space model), a frequently used classical text classification model, is analyzed to find the reason for its low classification precision, and a weight adjustment method is put forward in which the IDF function is replaced by an evaluation function used in feature selection. The performance of weight adjustment under various evaluation functions is analyzed theoretically and compared experimentally. Finally, a novel approach for constructing a new high-performance evaluation function is presented.
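The weight adjustment idea can be illustrated with a minimal sketch. It assumes the chi-square statistic as the feature-selection evaluation function; that is only one possible choice and is not necessarily the function adopted in this paper. The helper names (idf_weights, chi2_weights, weight_vector) and the toy corpus are hypothetical and serve illustration only.

```python
import math
from collections import Counter

def idf_weights(docs):
    """Classical IDF: log(N / df(t)) for every term t in the corpus."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / df[t]) for t in df}

def chi2_weights(docs, labels):
    """Chi-square score of every term, maximised over the classes.

    Assumption: chi-square stands in for the evaluation function.
    """
    n = len(docs)
    vocab = {t for d in docs for t in d}
    scores = {}
    for t in vocab:
        best = 0.0
        for c in set(labels):
            pairs = list(zip(docs, labels))
            a = sum(1 for d, y in pairs if t in d and y == c)      # t present, class c
            b = sum(1 for d, y in pairs if t in d and y != c)      # t present, other class
            e = sum(1 for d, y in pairs if t not in d and y == c)  # t absent, class c
            f = n - a - b - e                                      # t absent, other class
            denom = (a + e) * (b + f) * (a + b) * (e + f)
            if denom:
                best = max(best, n * (a * f - b * e) ** 2 / denom)
        scores[t] = best
    return scores

def weight_vector(doc, term_scores):
    """Document vector with weight = term frequency * term score."""
    tf = Counter(doc)
    return {t: tf[t] * term_scores.get(t, 0.0) for t in tf}

# Toy corpus (hypothetical): two classes, two documents each.
docs = [["ball", "goal", "the"], ["goal", "team", "the"],
        ["vote", "law", "the"], ["law", "party", "the"]]
labels = ["sport", "sport", "politics", "politics"]

idf = idf_weights(docs)
chi2 = chi2_weights(docs, labels)
tfidf_vec = weight_vector(docs[0], idf)   # classical TF-IDF weights
tfchi_vec = weight_vector(docs[0], chi2)  # TF with IDF replaced by chi-square
```

On this toy corpus, IDF ranks the rarer term "ball" above "goal", although "goal" occurs in every sport document and in no politics document. The chi-square score uses the class labels and ranks "goal" higher. Both functions assign zero weight to "the", which appears in every document.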