Classification algorithm based on semantics and text feature weighting

Text categorization faces three problems: the curse of dimensionality, noisy data, and feature words that contribute unequally to classification. To improve text classification accuracy, this paper presents a new approach to data processing. The approach first removes noisy data, then applies feature extraction algorithms and semantic analysis methods to reduce dimensionality. Different weights are then assigned to different text features based on a semantic similarity evaluation. The processed data are used to construct classifiers. Experimental results show that this text processing method can effectively improve the accuracy of text classification.
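
The abstract does not name the specific algorithms used, so the following is only a minimal sketch of the described pipeline (noise removal, dimensionality reduction, semantic feature weighting) under stated assumptions. The stop-word list, the document-frequency selector, the hand-written similarity table, and all function names are hypothetical stand-ins and are not taken from the paper.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list stands in for the noise-removal step.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

# Assumption: a hand-written table stands in for a real semantic
# similarity measure (for example one derived from a thesaurus).
SIMILARITY = {
    ("match", "sport"): 0.9,
    ("goal", "sport"): 0.8,
    ("team", "sport"): 0.7,
}


def remove_noise(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def select_features(docs, k):
    """Keep the k terms with the highest document frequency.

    Document frequency is a simple stand-in for the paper's
    feature extraction step; ties are broken alphabetically.
    """
    df = Counter(term for doc in docs for term in set(doc))
    ranked = sorted(df, key=lambda t: (-df[t], t))
    return ranked[:k]


def toy_similarity(a, b):
    """Return 1.0 for identical words, a table value if known, else 0.1."""
    if a == b:
        return 1.0
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.1))


def semantic_weight(term, topic_terms):
    """Weight a feature by its best similarity to any topic term."""
    return max(toy_similarity(term, topic) for topic in topic_terms)


def weighted_vector(doc, features, weights):
    """Term counts over the selected features, scaled by semantic weights."""
    counts = Counter(doc)
    return [counts[f] * weights[f] for f in features]


def build_vectors(raw_docs, topic_terms, k):
    """Run the whole sketch: clean, select features, weight, vectorize."""
    docs = [remove_noise(text) for text in raw_docs]
    features = select_features(docs, k)
    weights = {f: semantic_weight(f, topic_terms) for f in features}
    vectors = [weighted_vector(doc, features, weights) for doc in docs]
    return features, vectors
```

The resulting weighted vectors could then be passed to any standard classifier; which classifiers the paper actually constructs is not stated in the abstract.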