Improved Class Probability Estimates from Decision Tree Models

Decision tree models typically give good classification decisions but poor probability estimates. In many applications, it is important to have good probability estimates as well. This chapter introduces a new algorithm, Bagged Lazy Option Trees (B-LOTs), for constructing decision trees and compares it to an alternative, Bagged Probability Estimation Trees (B-PETs). The quality of the class probability estimates produced by the two methods is evaluated in two ways. First, we compare the ability of the two methods to make good classification decisions when the misclassification costs are asymmetric. Second, we compare the absolute accuracy of the estimates themselves. The experiments show that B-LOTs produce better decisions and more accurate probability estimates than B-PETs.
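
To make the comparison concrete, the sketch below shows a minimal B-PET-style baseline under stated assumptions: bagged, unpruned decision trees whose leaf class frequencies are Laplace-corrected and averaged across the ensemble, followed by the standard minimum-expected-cost decision rule used when misclassification costs are asymmetric. The function names (`fit_bpet`, `predict_proba`, `min_expected_cost_decision`), the use of scikit-learn, and the assumption that labels are encoded as integers 0..k-1 are all illustrative choices, not the chapter's implementation; B-LOTs themselves involve lazy, option-node tree construction and are not reproduced here.

```python
# Illustrative sketch only: a B-PET-style bagged probability estimator
# (unpruned trees + Laplace-corrected leaf frequencies, averaged over bags)
# and the minimum-expected-cost decision rule. Not the chapter's B-LOT code.
# Assumes y contains integer class labels 0..n_classes-1.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def fit_bpet(X, y, n_classes, n_trees=30, seed=0):
    """Fit bagged, unpruned trees; store per-leaf class counts for smoothing."""
    rng = np.random.RandomState(seed)
    ensemble = []
    for _ in range(n_trees):
        Xb, yb = resample(X, y, random_state=rng)          # bootstrap sample
        tree = DecisionTreeClassifier(criterion="entropy").fit(Xb, yb)
        counts = np.zeros((tree.tree_.node_count, n_classes))
        for leaf, label in zip(tree.apply(Xb), yb):
            counts[leaf, label] += 1                       # class counts at each leaf
        ensemble.append((tree, counts))
    return ensemble

def predict_proba(ensemble, X, n_classes):
    """Average Laplace-corrected leaf frequencies: (n_c + 1) / (n + k)."""
    probs = np.zeros((len(X), n_classes))
    for tree, counts in ensemble:
        c = counts[tree.apply(X)]          # counts at the leaf each example reaches
        probs += (c + 1.0) / (c.sum(axis=1, keepdims=True) + n_classes)
    return probs / len(ensemble)

def min_expected_cost_decision(probs, cost):
    """Choose the class i minimizing sum_j P(j|x) * cost[i, j]."""
    return np.argmin(probs @ cost.T, axis=1)
```

For example, on a two-class problem with `cost = np.array([[0, 10], [1, 0]])`, the decision rule shifts predictions toward the class whose misclassification is ten times costlier; this asymmetric-cost regime is exactly the first of the two evaluations described above, where poor probability estimates translate directly into poor decisions.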
