When Does Overfitting Decrease Prediction Accuracy in Induced Decision Trees and Rule Sets?

Researchers studying classification techniques based on induced decision trees and rule sets have found that the model that best fits the training data is unlikely to yield optimal performance on fresh data. Such a model is typically overfitted: it captures not only the true regularities reflected in the training data, but also chance patterns that have no significance for classification and that, in fact, reduce the model's predictive accuracy. Various simplification methods have been shown to help avoid overfitting in practice. Here, through detailed analysis of a paradigmatic example, I attempt to uncover the conditions under which these techniques work as expected. One important auxiliary result is the identification of conditions under which overfitting does not decrease predictive accuracy, and under which it would therefore be a mistake to apply simplification techniques if predictive accuracy is the key goal.
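
As an illustrative sketch only, and not drawn from the analysis in this paper, the following Python snippet contrasts an unpruned decision tree with a simplified one obtained by cost-complexity pruning on noisy synthetic data; the dataset parameters, train/test split, and the ccp_alpha value are assumptions chosen for illustration.

```python
# Sketch: an unpruned tree vs. a cost-complexity-pruned tree on noisy data.
# All dataset parameters and ccp_alpha are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic data with label noise (flip_y), so chance patterns exist
# in the training sample that carry no significance for classification.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unpruned tree: fits the training data closely, including the noise.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Simplified tree: pruning trades training fit for a smaller model.
pruned = DecisionTreeClassifier(ccp_alpha=0.01,
                                random_state=0).fit(X_train, y_train)

for name, tree in [("unpruned", full), ("pruned", pruned)]:
    print(f"{name}: "
          f"train={accuracy_score(y_train, tree.predict(X_train)):.2f}, "
          f"test={accuracy_score(y_test, tree.predict(X_test)):.2f}")
```

On data like this, the unpruned tree typically shows a large gap between training and test accuracy while the pruned tree generalizes better; on noise-free data the gap can vanish, which is the situation the auxiliary result above concerns.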