A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization

In this work, we present a new bottom-up algorithm for decision tree pruning that is very efficient (requiring only a single pass through the given tree), and prove a strong performance guarantee for the generalization error of the resulting pruned tree. We work in the typical setting in which the given tree T may have been derived from the given training sample S, and thus may badly overfit S. In this setting, we give bounds on the amount of additional generalization error that our pruning suffers compared to the optimal pruning of T. More generally, our results show that if there is a pruning of T with small error, and whose size is small compared to |S|, then our algorithm will find a pruning whose error is not much larger. This style of result has been called an index of resolvability result by Barron and Cover in the context of density estimation. A novel feature of our algorithm is its locality: the decision to prune a subtree is based entirely on properties of that subtree and the sample reaching it. To analyze our algorithm, we develop tools of local uniform convergence, a generalization of the standard notion that may prove useful in other settings.
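To make the locality property concrete, the following is a minimal sketch of a single-pass, bottom-up pruning procedure with a purely local criterion. The Node structure, the classify/err helpers, and in particular the form of the penalty function are illustrative assumptions for exposition, not the exact rule analyzed in the paper.

```python
from dataclasses import dataclass
from math import log, sqrt
from typing import List, Optional, Tuple

@dataclass
class Node:
    label: int                     # majority label of training examples here
    feature: int = 0               # split feature (internal nodes only)
    threshold: float = 0.0         # split threshold (internal nodes only)
    left: Optional["Node"] = None  # None marks a leaf
    right: Optional["Node"] = None

Sample = List[Tuple[List[float], int]]  # (feature vector, label) pairs

def size(t: Node) -> int:
    """Number of leaves in the subtree rooted at t."""
    return 1 if t.left is None else size(t.left) + size(t.right)

def classify(t: Node, x: List[float]) -> int:
    """Route x down the tree and return the label of the leaf it reaches."""
    while t.left is not None:
        t = t.left if x[t.feature] <= t.threshold else t.right
    return t.label

def err(t: Node, sample: Sample) -> float:
    """Empirical error of t on the examples reaching it."""
    if not sample:
        return 0.0
    return sum(classify(t, x) != y for x, y in sample) / len(sample)

def penalty(t: Node, m: int) -> float:
    """Hypothetical local complexity penalty: grows with subtree size,
    shrinks with the number m of examples reaching the subtree."""
    return sqrt(size(t) * log(max(m, 2)) / max(m, 1))

def prune(t: Node, sample: Sample) -> Node:
    """Single bottom-up pass: prune both children on the examples that
    reach them, then decide locally whether to collapse t into a leaf."""
    if t.left is None:
        return t
    left_s = [(x, y) for x, y in sample if x[t.feature] <= t.threshold]
    right_s = [(x, y) for x, y in sample if x[t.feature] > t.threshold]
    t.left = prune(t.left, left_s)
    t.right = prune(t.right, right_s)
    leaf = Node(label=t.label)
    # Purely local test: only this subtree and its own sample are consulted.
    if err(leaf, sample) <= err(t, sample) + penalty(t, len(sample)):
        return leaf
    return t
```

Note how the decision at each node depends only on the subtree rooted there and the sample routed to it, so the whole procedure completes in one pass over the tree; this is the structural property that the local uniform convergence analysis exploits.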