Control of Variables in Reducts - kNN Classification with Confidence

In rough set theory, a reduct is a minimal subset of features that has almost the same discerning power as the full feature set, so reducts are closely related to the classification classes. We propose using multiple reducts followed by k-nearest neighbor (kNN) classification with confidence to classify documents with higher accuracy. Because improving the accuracy requires several reducts, controlling which variables (attributes) enter them is important. To select better reducts, we develop a greedy algorithm based on the selection of useful attributes. The proposed methods are verified to be effective on benchmark datasets drawn from the Reuters-21578 collection.
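
As a rough illustration of the two ingredients named above, the following Python sketch selects attributes greedily and then classifies with kNN restricted to those attributes. It is not the algorithm of the paper, whose details the abstract does not give. It assumes a Johnson-style heuristic (repeatedly pick the attribute that discerns the most still-undiscerned object pairs of different classes), a Hamming distance over the selected attributes, and a confidence defined as the share of the k nearest neighbors that vote for the winning class. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def discernibility_sets(rows, labels):
    """For each pair of objects with different labels, collect the
    indices of the attributes on which the two objects differ."""
    sets = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = frozenset(
                a for a, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
            )
            if diff:  # identical objects with different labels cannot be discerned
                sets.append(diff)
    return sets


def greedy_reduct(rows, labels):
    """Assumed Johnson-style heuristic: the result discerns every
    discernible pair, but it is not guaranteed to be a minimal reduct."""
    remaining = discernibility_sets(rows, labels)
    chosen = []
    while remaining:
        counts = Counter(a for s in remaining for a in s)
        # Ties are broken in favor of the lowest attribute index.
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return sorted(chosen)


def knn_with_confidence(rows, labels, query, attrs, k=3):
    """kNN over the attributes in `attrs` using Hamming distance.
    Returns (label, confidence); confidence is an assumed definition:
    the fraction of the nearest neighbors voting for the winner."""
    ranked = sorted(
        (sum(row[a] != query[a] for a in attrs), idx)
        for idx, row in enumerate(rows)
    )
    neighbors = ranked[:k]
    votes = Counter(labels[idx] for _, idx in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbors)


# Toy term-occurrence table (hypothetical data, not from Reuters-21578).
rows = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
labels = ["grain", "grain", "trade", "trade"]

reduct = greedy_reduct(rows, labels)
label, confidence = knn_with_confidence(rows, labels, (1, 0, 0), reduct, k=3)
print(reduct, label, confidence)
```

With several reducts, one possible combination rule (again an assumption) is to run knn_with_confidence once per reduct and keep the prediction whose confidence is highest.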
