An Optimized K-Nearest Neighbor Algorithm for Large Scale Hierarchical Text Classification
This paper summarizes an optimized k-nearest neighbor (k-NN) algorithm developed for the 2nd edition of the Large Scale Hierarchical Text Classification Pascal Challenge. First, we run the k-NN algorithm on the datasets to obtain the top-k nearest neighbors of each test document. Second, we identify several critical category-neighbor features and estimate the impact of each feature through cross-validation. Finally, the category prediction algorithm uses the optimal parameters for these category-neighbor features to predict the categories of the test documents. Experiments on the three challenge datasets show that the classifier achieves high accuracy.
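The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the category-neighbor features, so this sketch assumes a single illustrative feature (the summed cosine similarity of the neighbors that carry a category), and the names `alpha` and `threshold` are hypothetical parameters standing in for the values that would be tuned through cross-validation.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_neighbors(doc, train, k):
    """Step 1: return the k training items most similar to doc as (similarity, categories)."""
    scored = [(cosine(doc, vec), cats) for vec, cats in train]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def predict_categories(doc, train, k=3, alpha=1.0, threshold=0.8):
    """Steps 2 and 3: score each category seen among the neighbors, then select categories.

    Assumed feature (not from the paper): sum of similarity**alpha over the
    neighbors labeled with the category. A category is predicted when its score
    reaches threshold * (best score).
    """
    scores = defaultdict(float)
    for sim, cats in top_k_neighbors(doc, train, k):
        for cat in cats:
            scores[cat] += sim ** alpha
    if not scores:
        return []
    best = max(scores.values())
    if best == 0.0:
        return []
    return sorted(cat for cat, s in scores.items() if s >= threshold * best)


# Toy training set: (sparse term vector, set of categories). Purely illustrative data.
train = [
    ({"goal": 1.0, "match": 1.0}, {"sports"}),
    ({"goal": 1.0, "league": 1.0}, {"sports", "football"}),
    ({"stock": 1.0, "market": 1.0}, {"finance"}),
]
doc = {"goal": 1.0, "league": 1.0}

strict = predict_categories(doc, train, k=2, threshold=0.8)
loose = predict_categories(doc, train, k=2, threshold=0.6)
print(strict, loose)
```

In this sketch, lowering `threshold` admits more categories per document, which is the kind of trade-off that cross-validation on held-out training documents would be used to settle.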