Estimating Prediction Certainty in Decision Trees

Decision trees estimate prediction certainty using the class distribution in the leaf responsible for the prediction. We introduce an alternative method that yields better estimates. For each instance to be classified, our method adds the instance to the training set with one of the possible labels of the target attribute and induces a tree; this procedure is repeated for each label. Then, by comparing the outcomes of the resulting trees, the method identifies instances that are likely to be difficult to classify correctly and attributes some uncertainty to their predictions. We perform an extensive evaluation of the proposed method and show that it is particularly suitable for ranking and reliability estimation. The ideas investigated in this paper may also be applied to other machine learning techniques and combined with other methods for estimating prediction certainty.
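The label-insertion procedure described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a depth-1 tree (decision stump) on a single numeric feature stands in for a full tree learner, and the hypothetical `agreement` score (the share of retrained trees that vote for the most common prediction) is only one possible way of comparing the trees. All function names are illustrative.

```python
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 tree on one numeric feature.

    Returns (threshold, left_label, right_label). This is a stand-in
    for a real tree learner, chosen only to keep the sketch short.
    """
    values = sorted(set(X))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        left = [lab for x, lab in zip(X, y) if x <= t]
        right = [lab for x, lab in zip(X, y) if x > t]
        left_label, left_hits = Counter(left).most_common(1)[0]
        right_label, right_hits = Counter(right).most_common(1)[0]
        errors = len(y) - left_hits - right_hits
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    if best is None:
        # All feature values are equal: always predict the majority label.
        majority = Counter(y).most_common(1)[0][0]
        return (float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def label_insertion_votes(X, y, x_new, labels):
    """Retrain once per candidate label with (x_new, label) added.

    Returns a dict mapping each inserted label to the prediction that
    the corresponding retrained tree makes for x_new.
    """
    votes = {}
    for label in labels:
        stump = fit_stump(X + [x_new], y + [label])
        votes[label] = predict_stump(stump, x_new)
    return votes


def agreement(votes):
    """Assumed certainty score: share of trees voting for the top label."""
    label, hits = Counter(votes.values()).most_common(1)[0]
    return label, hits / len(votes)


# Toy data: class "a" at low values, class "b" at high values.
X = [1, 2, 3, 7, 8, 9]
y = ["a", "a", "a", "b", "b", "b"]

for x_new in (1.5, 5):
    votes = label_insertion_votes(X, y, x_new, ["a", "b"])
    print(x_new, votes, agreement(votes))
```

In this toy setting, an instance deep inside one class region receives the same prediction whichever label is inserted, whereas an instance near the class boundary takes on whatever label it was inserted with. The disagreement between the retrained trees is what flags the second instance as uncertain.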
