Text Categorization Method Based on Improved KNN
To address the shortcomings of the K-nearest neighbor (KNN) algorithm for text processing in vector space models (VSM), this paper proposes an improved KNN text categorization method based on self-organizing map (SOM) neural network theory, feature selection theory, and pattern aggregation theory. Feature selection and pattern aggregation are used to reduce the dimensionality of the feature space. Because each dimension of the VSM carries the same weight, which is not suitable for text processing, the paper applies an SOM neural network to calculate the weight of each dimension. Combining the two improvements, the method efficiently reduces the dimensionality of the vector space and improves both the accuracy and the speed of text categorization.
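The idea of giving each VSM dimension its own weight can be illustrated with a minimal sketch of KNN using a weighted distance. The abstract does not specify how the SOM derives the weights or which distance form is used, so several things here are assumptions for illustration only: the weighted Euclidean distance, the function names `weighted_distance` and `weighted_knn_predict`, the toy vectors, and the hand-picked `weights` list (which stands in for weights that the paper would obtain from an SOM).

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which dimension i is scaled by weights[i].

    Assumption: the abstract does not state the distance form; this is
    one common way to apply per-dimension weights.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def weighted_knn_predict(query, train_vectors, train_labels, weights, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: weighted_distance(query, pair[0], weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy term vectors (hypothetical data): the third dimension is a noisy
# term that does not separate the two classes.
train_vectors = [
    [1.0, 0.0, 0.9],
    [0.9, 0.1, 0.1],
    [0.8, 0.0, 0.8],
    [0.1, 1.0, 0.2],
    [0.0, 0.9, 0.9],
    [0.2, 0.8, 0.1],
]
train_labels = ["sports", "sports", "sports", "finance", "finance", "finance"]

# Hand-picked weights standing in for SOM-derived ones (assumption):
# the noisy third dimension is down-weighted.
weights = [1.0, 1.0, 0.05]

query = [0.9, 0.1, 0.5]
print(weighted_knn_predict(query, train_vectors, train_labels, weights, k=3))
```

With uniform weights the same code reduces to ordinary KNN, so the effect of any weighting scheme can be compared by swapping only the `weights` argument.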