Ranking the Rules and Instances of Decision Trees

Traditionally, decision trees rank instances using the local probability estimate at each leaf node, so all instances falling into the same leaf receive equal scores and cannot be ranked against one another. In this paper, we propose a hierarchical ranking strategy that combines decision trees with leaf-weighted Naive Bayes to improve the local probability estimation at each leaf: we first weight the rules (leaves) by their importance, and then rank the instances that satisfy each rule. Because probability estimates produced by Naive Bayes can be poor, we also investigate several techniques that have been proposed to modify Naive Bayes. Paired t-tests show that the proposed method performs significantly better than competing methods. All results are evaluated with AUC (area under the ROC curve) rather than classification accuracy.
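The core problem and remedy described above can be sketched in a few lines. This is an illustrative toy (not the authors' algorithm): a one-split "tree" on attribute x0 gives every instance in a leaf the same Laplace-corrected leaf frequency, so within-leaf ranking is arbitrary; fitting a small Laplace-smoothed Naive Bayes over the remaining attribute inside each leaf breaks those ties, which the Mann-Whitney form of AUC rewards. The dataset and the NB-per-leaf design are assumptions made purely for the demonstration.

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney rank-sum formula (tied scores share the average rank)."""
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        for k in range(i, j):              # tied block: mean of 1-based ranks i+1 .. j
            ranks[k] = (i + j + 1) / 2.0
        i = j
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    n_pos = sum(y for _, y in pairs)
    n_neg = len(pairs) - n_pos
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# toy data: (x0, x1, label); the "tree" splits only on x0
data = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1),
        (1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 1)]
labels = [y for _, _, y in data]

def leaf_frequency_scores(data):
    """Classical leaf estimate: Laplace-corrected class frequency per leaf."""
    scores = []
    for x0, _, _ in data:
        leaf = [(a, b, y) for a, b, y in data if a == x0]
        pos = sum(y for _, _, y in leaf)
        scores.append((pos + 1) / (len(leaf) + 2))  # same score for the whole leaf
    return scores

def leaf_naive_bayes_scores(data):
    """Per-leaf Naive Bayes over the remaining attribute x1 (Laplace-smoothed)."""
    scores = []
    for x0, x1, _ in data:
        leaf = [(a, b, y) for a, b, y in data if a == x0]
        n = len(leaf)
        joint = {}
        for c in (0, 1):
            members = [b for _, b, y in leaf if y == c]
            prior = (len(members) + 1) / (n + 2)
            like = (sum(1 for b in members if b == x1) + 1) / (len(members) + 2)
            joint[c] = prior * like
        scores.append(joint[1] / (joint[0] + joint[1]))  # normalised P(y=1 | leaf, x1)
    return scores

auc_flat = auc(labels, leaf_frequency_scores(data))
auc_nb = auc(labels, leaf_naive_bayes_scores(data))
print(auc_flat, auc_nb)   # → 0.75 0.875: per-leaf NB orders within-leaf instances
```

On this data the leaf-frequency scores tie all four instances within each leaf (AUC 0.75), while the per-leaf Naive Bayes separates them by x1 and lifts AUC to 0.875, which is the tie-breaking effect the paper's hierarchical strategy exploits.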
