SOME INFINITY THEORY FOR PREDICTOR ENSEMBLES

To dispel some of the mystery about what makes tree ensembles work, we examine them in distribution space, i.e. the limiting case of "infinite" sample size. It is shown that the simplest kind of trees are complete in D-dimensional space if the number of terminal nodes T is greater than D. For such trees we show that the Adaboost minimization algorithm gives an ensemble converging to the Bayes risk. Random forests, which are grown using i.i.d. random vectors in the tree construction, are shown to be equivalent to a kernel acting on the true margin. The form of this kernel is derived for purely random tree growing, and its properties are explored. The notions of correlation and strength for random forests are reflected in the symmetry and skewness of the kernel.