An Improved KNN Text Categorization Algorithm Based on Density

The KNN algorithm is widely used in the field of artificial intelligence. As a text categorization algorithm, it is simple, effective, and easy to implement. However, the time complexity of KNN is directly proportional to the sample size, and its categorization accuracy decreases when the training samples are unevenly distributed. An improved KNN algorithm is proposed that raises text categorization accuracy by adjusting the distribution of the training samples: it analyzes the training set and reduces the samples in regions of high distribution density. Experiments show that the algorithm has lower time complexity and achieves better accuracy and recall than the common KNN algorithm in text classification.
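
The idea of thinning dense regions before running KNN can be illustrated with a minimal sketch. The abstract does not specify the density measure or the reduction rule, so everything below is an assumption made for illustration: cosine similarity over term-weight vectors, a greedy pass that drops a sample once enough already-kept samples of the same class lie close to it, and the hypothetical names `thin_dense_regions`, `knn_predict`, `sim_threshold`, and `max_close`. It is not the method evaluated in the experiments.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def thin_dense_regions(vectors, labels, sim_threshold=0.95, max_close=2):
    """Return indices of the samples to keep (assumed greedy rule).

    A sample is dropped when at least `max_close` already-kept samples of
    the same class have cosine similarity >= `sim_threshold` to it, i.e.
    when its neighbourhood is already densely covered.
    """
    kept = []
    for i, (vec, lab) in enumerate(zip(vectors, labels)):
        close = sum(
            1
            for j in kept
            if labels[j] == lab and cosine(vec, vectors[j]) >= sim_threshold
        )
        if close < max_close:
            kept.append(i)
    return kept


def knn_predict(query, vectors, labels, k=3):
    """Plain majority-vote KNN using cosine similarity."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): five near-duplicate "sports" vectors, two "finance" ones.
vectors = [
    [1.00, 0.00], [0.99, 0.01], [0.98, 0.02], [0.97, 0.03], [0.96, 0.04],
    [0.00, 1.00], [0.10, 0.90],
]
labels = ["sports"] * 5 + ["finance"] * 2

kept = thin_dense_regions(vectors, labels)
reduced_vectors = [vectors[i] for i in kept]
reduced_labels = [labels[i] for i in kept]

prediction = knn_predict([0.3, 0.7], reduced_vectors, reduced_labels, k=3)
print(kept)        # [0, 1, 5, 6]
print(prediction)  # finance
```

In this toy run the dense "sports" cluster shrinks from five samples to two while the sparse "finance" class is left untouched, so the classifier compares each query against fewer training samples and the class sizes become more even. Whether this improves accuracy on real corpora depends on the density measure and thresholds, which this sketch only guesses at.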