Building forests of local trees

A novel approach in the field of classifier ensembles is proposed. The approach uses an ensemble of random decision trees, each trained on a different area of the input space. Areas can overlap, and good coverage of the input space is ensured. Experimental results confirm the validity of the approach.

Ensemble methods have been shown to be more effective than monolithic classifiers, in particular when their components are diverse. How to enforce diversity in classifier ensembles has received much attention from machine learning researchers, yielding a variety of techniques and algorithms. In this paper, a novel algorithm for ensemble classifiers is proposed, in which ensemble components are trained with a focus on different regions of the sample space. Diversity then arises mainly as a consequence of deliberately limiting the scope of the base classifiers. The proposed algorithm shares roots with several ensemble paradigms, in particular with random forests, as it also generates forests of decision trees. Because each decision tree is trained on a specific subset of the sample space, the resulting ensemble is in fact a forest of local trees. Comparative experimental results show that, on average, these ensembles perform better than other relevant kinds of ensemble classifiers, including random forests.
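The abstract does not fix how the overlapping regions are drawn or how votes are combined, so the following Python sketch is one plausible reading of the idea rather than the authors' algorithm: each region is the neighbourhood of a randomly chosen prototype point, one decision tree is trained per region, and prediction is a plain majority vote. The class name LocalTreeForest and the parameters n_trees and coverage are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


class LocalTreeForest:
    """Sketch of a forest of local trees: each tree is fit only on the
    samples inside a randomly drawn, possibly overlapping region."""

    def __init__(self, n_trees=50, coverage=0.5, random_state=None):
        self.n_trees = n_trees      # number of local trees in the forest
        self.coverage = coverage    # fraction of the training set per region
        self.rng = np.random.default_rng(random_state)
        self.trees = []
        self.prototypes = []

    def fit(self, X, y):
        n = X.shape[0]
        k = max(2, int(self.coverage * n))  # region size, in samples
        for _ in range(self.n_trees):
            # Region = the k nearest neighbours of a random prototype.
            # With n_trees * coverage well above 1, regions overlap and
            # jointly cover the input space with high probability.
            proto = X[self.rng.integers(n)]
            region = np.argsort(np.linalg.norm(X - proto, axis=1))[:k]
            seed = int(self.rng.integers(2**31 - 1))
            tree = DecisionTreeClassifier(random_state=seed)
            tree.fit(X[region], y[region])
            self.trees.append(tree)
            self.prototypes.append(proto)
        return self

    def predict(self, X):
        # Plain majority vote over all local trees; a distance-weighted
        # vote based on self.prototypes would be a natural refinement.
        votes = np.stack([tree.predict(X) for tree in self.trees])
        majority = []
        for column in votes.T:
            labels, counts = np.unique(column, return_counts=True)
            majority.append(labels[np.argmax(counts)])
        return np.array(majority)


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = LocalTreeForest(n_trees=50, coverage=0.5, random_state=0)
    print("test accuracy:", (clf.fit(X_tr, y_tr).predict(X_te) == y_te).mean())
```

Under these assumed settings (coverage = 0.5, 50 trees), each training point falls in roughly 25 regions on average, which illustrates the abstract's point that regions overlap while still covering the input space.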
