On the asymptotics of random forests

The last decade has witnessed a growing interest in random forest models, which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains largely unexplained, thereby creating a gap between theory and practice. In this paper, we present some asymptotic results on random forests in a regression framework. First, we provide theoretical guarantees that link finite forests used in practice (with a finite number M of trees) to their asymptotic counterparts (with M = ∞). Using empirical process theory, we prove a uniform central limit theorem for a large class of random forest estimates, which holds in particular for Breiman's (2001) original forests. Second, we show that infinite forest consistency implies finite forest consistency, and we then establish the consistency of several infinite forests. In particular, we prove that q quantile forests, which are close in spirit to Breiman's (2001) forests but easier to study, are able to combine inconsistent trees to obtain a final consistent prediction, thus highlighting the benefits of random forests compared to single trees.
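
The link between finite and infinite forests can be illustrated numerically. The sketch below is only a toy illustration and not the construction analyzed in the paper: it assumes a one-dimensional regression model, uses purely random trees (uniform cut points on [0, 1]) instead of Breiman's splitting rule, and all function names and parameters (`make_data`, `tree_predict`, `forest_predict`, `spread_over_randomization`, `n_cuts`) are hypothetical. With the data held fixed, it measures how much the forest prediction at a point varies when only the tree randomization is redrawn; by the law of large numbers this spread should shrink as the number M of trees grows.

```python
import random
import statistics


def make_data(n, rng):
    """Toy sample (an assumption): X ~ U[0, 1], Y = X^2 + Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def tree_predict(x, xs, ys, n_cuts, rng):
    """One purely random tree: n_cuts uniform cut points partition [0, 1];
    the prediction is the mean response in the cell containing x."""
    cuts = sorted(rng.random() for _ in range(n_cuts))
    lo = max((c for c in cuts if c <= x), default=0.0)
    hi = min((c for c in cuts if c > x), default=1.0)
    cell = [y for xi, y in zip(xs, ys) if lo <= xi < hi]
    # Convention for this sketch: an empty cell predicts 0.
    return statistics.fmean(cell) if cell else 0.0


def forest_predict(x, xs, ys, n_trees, n_cuts, rng):
    """Finite forest: average of n_trees independently randomized trees."""
    return statistics.fmean(
        tree_predict(x, xs, ys, n_cuts, rng) for _ in range(n_trees)
    )


def spread_over_randomization(n_trees, xs, ys, x=0.5, n_cuts=5,
                              repeats=200, seed=1):
    """Standard deviation of the forest prediction at x when only the
    tree randomness is redrawn and the data stay fixed."""
    rng = random.Random(seed)
    preds = [
        forest_predict(x, xs, ys, n_trees, n_cuts, rng)
        for _ in range(repeats)
    ]
    return statistics.pstdev(preds)


if __name__ == "__main__":
    xs, ys = make_data(200, random.Random(0))
    for m in (1, 10, 100):
        # The printed spread is expected to decrease as m grows.
        print(m, round(spread_over_randomization(m, xs, ys), 4))
```

The spread over the randomization is the part of the error that vanishes in the infinite forest (M = ∞); what remains in that limit depends only on the data and on the distribution of the randomization, which is the object whose consistency the abstract discusses.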

[1]  Paul Horton,et al.  Network-based de-noising improves prediction from microarray data , 2006, BMC Bioinformatics.

[2]  Bertrand Michel,et al.  Grouped variable importance with random forests and application to multiple functional data analysis , 2014, Comput. Stat. Data Anal.

[3]  Yee Whye Teh,et al.  Mondrian Forests: Efficient Online Random Forests , 2014, NIPS.

[4]  Trevor J. Hastie,et al.  Confidence intervals for random forests: the jackknife and the infinitesimal jackknife , 2013, J. Mach. Learn. Res.

[5]  Jean-Philippe Vert,et al.  Consistency of Random Forests , 2014, 1405.2881.

[6]  Stéphan Clémençon,et al.  Ranking forests , 2013, J. Mach. Learn. Res.

[7]  Andy Liaw,et al.  Classification and Regression by randomForest , 2007 .

[8]  Misha Denil,et al.  Consistency of Online Random Forests , 2013, ICML.

[9]  Ramón Díaz-Uriarte,et al.  Gene selection and classification of microarray data using random forest , 2006, BMC Bioinformatics.

[10]  P. Massart The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality , 1990 .

[11]  Arnaud Guyader,et al.  New insights into Approximate Bayesian Computation , 2012, 1207.6461.

[12]  Hemant Ishwaran,et al.  Random Survival Forests , 2008, Wiley StatsRef: Statistics Reference Online.

[14]  László Györfi,et al.  A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.

[15]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov.

[16]  Pierre Geurts,et al.  Extremely randomized trees , 2006, Machine Learning.

[17]  Tin Kam Ho,et al.  The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[18]  Luc Devroye,et al.  Consistency of Random Forests and Other Averaging Classifiers , 2008, J. Mach. Learn. Res.

[19]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[20]  Cha Zhang,et al.  Ensemble Machine Learning , 2012 .

[21]  Leo Breiman,et al.  Randomizing Outputs to Increase Prediction Accuracy , 2000, Machine Learning.

[22]  Robin Genuer,et al.  Random Forests: some methodological insights , 2008, 0811.3619.

[23]  Udaya B. Kogalur,et al.  Consistency of Random Survival Forests , 2008, Statistics & Probability Letters.

[24]  C. J. Stone,et al.  Consistent Nonparametric Regression , 1977 .

[25]  Thomas G. Dietterich,et al.  Machine Learning Bias, Statistical Bias, and Statistical Variance of Decision Tree Algorithms , 2008 .

[26]  Enea G. Bongiorno,et al.  Contributions in Infinite-Dimensional Statistics and Related Topics , 2014 .

[27]  L. Breiman SOME INFINITY THEORY FOR PREDICTOR ENSEMBLES , 2000 .

[28]  Hans-Georg Müller,et al.  Functional Data Analysis , 2016 .

[29]  Adam Krzyzak,et al.  A Distribution-Free Theory of Nonparametric Regression , 2002, Springer series in statistics.

[30]  Jon A. Wellner,et al.  Weak Convergence and Empirical Processes: With Applications to Statistics , 1996 .

[31]  Gérard Biau,et al.  Analysis of a Random Forests Model , 2010, J. Mach. Learn. Res.

[32]  G. Hooker,et al.  Ensemble Trees and CLTs: Statistical Inference for Supervised Learning , 2014 .

[33]  Jean-Michel Poggi,et al.  Classification supervisée en grande dimension. Application à l'agrément de conduite automobile , 2010, 1010.6227.

[34]  Hans Knutsson,et al.  Reinforcement Learning Trees , 1996 .

[35]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[36]  Stefan Wager Asymptotic Theory for Random Forests , 2014, 1405.0352.

[37]  Denis Larocque,et al.  An empirical comparison of ensemble methods based on classification trees , 2003 .

[38]  Z. Q. John Lu,et al.  Nonparametric Functional Data Analysis: Theory And Practice , 2007, Technometrics.

[39]  Piotr Kokoszka,et al.  Inference for Functional Data with Applications , 2012 .

[40]  Hemant Ishwaran,et al.  The effect of splitting on random forests , 2014, Machine Learning.

[41]  Gábor Lugosi,et al.  Concentration Inequalities - A Nonasymptotic Theory of Independence , 2013, Concentration Inequalities.

[42]  Nicolai Meinshausen,et al.  Quantile Regression Forests , 2006, J. Mach. Learn. Res.

[43]  Adele Cutler,et al.  PERT – Perfect Random Tree Ensembles , 2001 .

[45]  Philip H. S. Torr,et al.  Randomized trees for human pose detection , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.