Maximizing Tree Diversity by Building Complete-Random Decision Trees

One of the ways to lower generalization error of decision tree ensemble is to maximize tree diversity. Building complete-random trees forgoes strength obtained from a test selection criterion. However, it achieves higher tree diversity. We provide a taxonomy of different randomization methods and find that complete-random test selection produces diverse trees and other randomization methods such as bootstrap sampling may impair tree growth and limit tree diversity. The well accepted practice in constructing decision trees is to apply bootstrap sampling and voting. To challenge this practice, we explore eight variants of complete-random trees using three parameters: ensemble methods, tree height restriction and sample randomization. Surprisingly, the most accurate variant is very simple and performs comparably to Bagging and Random Forests. It achieves good results by maximizing tree diversity and is called Max-diverse Ensemble.