Averaging over decision trees

Some researchers consider pruning to be the most important part of building a decision tree in noisy domains. While there are many approaches to pruning, the alternative of averaging over decision trees has received less attention. The basic idea of tree averaging is to produce a weighted sum of the decisions made by a set of trees. We consider which trees should be included in the set used for averaging, and how a weight should be assigned to each tree in that set. We define the concept of a fanned set for a tree, and examine how the Minimum Message Length paradigm of learning may be used to average over decision trees. We then empirically evaluate two averaging approaches and a Minimum Message Length approach.
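As a rough illustration of the "weighted sum of decisions" idea, the following Python sketch combines the class distributions predicted by several trees for a single example. The function names, the dictionary representation of a prediction, and the rule that turns message lengths into weights (each weight proportional to 2 raised to minus the message length in bits) are assumptions made for this example; the abstract does not specify how the weights are actually computed.

```python
from typing import Dict, List, Sequence


def average_predictions(
    distributions: Sequence[Dict[str, float]],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Combine per-tree class distributions into one weighted distribution.

    Hypothetical helper: each entry of `distributions` is one tree's
    predicted class probabilities for the same example.
    """
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per tree")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined: Dict[str, float] = {}
    for dist, weight in zip(distributions, weights):
        for label, prob in dist.items():
            # Normalise the weights so the result is still a distribution.
            combined[label] = combined.get(label, 0.0) + (weight / total) * prob
    return combined


def weights_from_message_lengths(lengths_bits: Sequence[float]) -> List[float]:
    """Assumed weighting rule: weight proportional to 2 ** (-message length).

    Lengths are shifted by the shortest one so the powers stay well scaled;
    the shift cancels when the weights are normalised.
    """
    shortest = min(lengths_bits)
    return [2.0 ** -(length - shortest) for length in lengths_bits]


# Two hypothetical trees predicting for the same example.
tree_outputs = [{"yes": 0.9, "no": 0.1}, {"yes": 0.5, "no": 0.5}]
averaged = average_predictions(tree_outputs, weights=[3.0, 1.0])
```

Under this assumed rule, a tree whose message is one bit longer than another's receives half the weight, so trees that explain the data far less compactly contribute very little to the averaged decision.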
