On the selection of decision trees in Random Forests

In this paper, we present a study of the Random Forest (RF) family of ensemble methods. In a “classical” RF induction process, a fixed number of randomized decision trees is induced to form an ensemble. This kind of algorithm has two main drawbacks: (i) the number of trees has to be fixed a priori; (ii) the interpretability and analysis capabilities offered by decision tree classifiers are lost due to the randomization principle. Because trees are added to the ensemble independently of one another, there is no guarantee that they will all cooperate effectively in the same committee. This observation raises two questions: are there decision trees in an RF that degrade the performance of the ensemble? If so, is it possible to form a more accurate committee by removing the poorly performing trees? We tackle these questions as a classifier selection problem. We show that better subsets of decision trees can be obtained even with a sub-optimal classifier selection method. This proves that the “classical” RF induction process, in which randomized trees are added to the ensemble arbitrarily, is not the best approach to producing accurate RF classifiers. We also show the interest of designing RFs by adding trees in a more dependent way than is traditionally done in “classical” RF induction algorithms.
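To make the idea concrete, here is a minimal sketch of tree selection in a Random Forest via Sequential Forward Selection (SFS), one sub-optimal classifier selection method of the kind the abstract alludes to. The dataset, split sizes, number of trees, and plain majority voting are illustrative assumptions, not the paper's exact protocol, and scoring on the same selection set is for brevity only.

```python
# Sketch: greedy (SFS) selection of a sub-committee of trees from a
# trained Random Forest. Assumptions (not from the paper): digits data,
# a 70/30 train/selection split, 50 trees, unweighted majority voting.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_sel, y_train, y_sel = train_test_split(
    X, y, test_size=0.3, random_state=0)

# "Classical" RF induction: a fixed number of independently randomized trees.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

# Pre-compute each tree's predictions on the selection set once.
preds = np.stack([t.predict(X_sel) for t in forest.estimators_]).astype(int)

def committee_accuracy(indices):
    """Accuracy of the sub-committee `indices` under plain majority voting."""
    votes = preds[indices]                       # shape: (k, n_samples)
    majority = np.apply_along_axis(
        lambda v: np.bincount(v).argmax(), 0, votes)
    return accuracy_score(y_sel, majority)

# SFS: greedily add the tree that most improves the committee on the
# selection set, and remember the best subset seen along the way.
selected, remaining = [], list(range(len(forest.estimators_)))
best_subset, best_acc = [], -1.0
while remaining:
    acc, choice = max((committee_accuracy(selected + [i]), i)
                      for i in remaining)
    selected.append(choice)
    remaining.remove(choice)
    if acc > best_acc:
        best_acc, best_subset = acc, list(selected)

print(f"full forest accuracy: {committee_accuracy(list(range(len(preds)))):.4f}")
print(f"best sub-committee ({len(best_subset)} trees): {best_acc:.4f}")
```

In practice the retained subset should be evaluated on an independent test set, since selecting and scoring on the same data overfits the selection set; the point of the sketch is only that a greedy, sub-optimal search over tree subsets can already outperform the full committee.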
