PERT – Perfect Random Tree Ensembles

Ensemble classifiers originated in the machine learning community. They work by fitting many individual classifiers and combining them by weighted or unweighted voting. The ensemble classifier is often much more accurate than the individual classifiers from which it is built. In fact, ensemble classifiers are among the most accurate general-purpose classifiers available. We introduce a new ensemble method, PERT, in which each individual classifier is a perfectly fit classification tree with random selection of splits. Compared to other ensemble methods, PERT is very fast to fit. Given the randomness of the split selection, PERT is surprisingly accurate. Calculations suggest that one reason PERT works so well is that, although the individual tree classifiers are extremely weak, they are almost uncorrelated. The simple probabilistic nature of the classifier lends itself to theoretical analysis. We show that PERT fits a continuous posterior probability surface for each class. As such, it can be viewed as a classification-via-regression procedure that fits a continuous interpolating surface. In theory, this surface could be found using a one-shot procedure.
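
The abstract does not spell out the split rule, so the following is a minimal sketch of one plausible reading of "perfectly fit trees with random splits": at each impure node, pick two training points from different classes, pick a feature at random, and cut at a random point between their values on that feature; grow until every leaf is pure and combine trees by unweighted majority vote. The class names (`RandomSplitTree`, `PERT`), the retry limit `max_tries`, and the specific split rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

class RandomSplitTree:
    """One 'perfect' random tree: grown until every leaf is pure, with split
    features and thresholds chosen at random rather than by an impurity
    criterion. Assumed split rule, for illustration only."""

    def fit(self, X, y, rng, max_tries=10):
        self.rng = rng
        self.max_tries = max_tries
        self.root = self._grow(X, y)
        return self

    def _grow(self, X, y):
        # Pure node: stop and record the class label (perfect fit).
        if len(np.unique(y)) == 1:
            return ("leaf", y[0])
        for _ in range(self.max_tries):
            # Pick two points with different labels, a random feature,
            # and a uniformly random cut between their feature values.
            i, j = self._two_points_diff_class(y)
            f = self.rng.integers(X.shape[1])
            if X[i, f] == X[j, f]:
                continue  # no separating cut on this feature; retry
            alpha = self.rng.uniform()
            t = alpha * X[i, f] + (1 - alpha) * X[j, f]
            left = X[:, f] <= t
            if left.all() or (~left).all():
                continue  # degenerate split; retry
            return ("node", f, t,
                    self._grow(X[left], y[left]),
                    self._grow(X[~left], y[~left]))
        # Fallback if no usable split was found: majority-class leaf.
        vals, counts = np.unique(y, return_counts=True)
        return ("leaf", vals[counts.argmax()])

    def _two_points_diff_class(self, y):
        while True:  # node is impure, so this terminates with probability 1
            i, j = self.rng.integers(len(y), size=2)
            if y[i] != y[j]:
                return i, j

    def predict_one(self, x):
        node = self.root
        while node[0] == "node":
            _, f, t, lo, hi = node
            node = lo if x[f] <= t else hi
        return node[1]

class PERT:
    """Ensemble of perfectly fit random trees, combined by unweighted voting."""

    def __init__(self, n_trees=100, seed=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        # Labels assumed encoded as integers 0..K-1 (for np.bincount below).
        self.trees = [RandomSplitTree().fit(X, y, self.rng)
                      for _ in range(self.n_trees)]
        return self

    def predict(self, X):
        votes = np.array([[t.predict_one(x) for x in X] for t in self.trees])
        # Unweighted majority vote across trees for each test point.
        return np.array([np.bincount(col).argmax() for col in votes.T])

# Toy usage: two Gaussian blobs, labels in {0, 1}. Because each tree is
# grown to purity, the ensemble reproduces the training labels exactly.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.repeat([0, 1], 50)
model = PERT(n_trees=50, seed=2).fit(X, y)
print((model.predict(X) == y).mean())  # 1.0 on the training data
```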
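
The "weak but almost uncorrelated" claim can be made concrete with a standard variance decomposition (general background intuition, not a computation from this paper): for $M$ identically distributed predictors $f_1, \dots, f_M$, each with variance $\sigma^2$ and pairwise correlation $\rho$,

$$\operatorname{Var}\!\left(\frac{1}{M}\sum_{m=1}^{M} f_m\right) = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2,$$

which tends to $\rho\,\sigma^2$ as $M$ grows. So even if each random tree is a weak, high-variance classifier, near-zero correlation between trees lets voting average almost all of that variance away.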