Graph-Based Model-Selection Framework for Large Ensembles

The intuition behind ensembles is that different prediction models compensate for each other's errors when combined in an appropriate way. In large ensembles, many prediction models are available; however, many of them may share similar error characteristics, which greatly diminishes the compensation effect. The selection of an appropriate subset of models is therefore crucial. In this paper, we address this problem. As our major contribution, for the case where a large number of models is present, we propose a graph-based framework for model selection that pays special attention to the interaction effects between models. Within this framework, we introduce four ensemble techniques and compare them to the state of the art in experiments on publicly available real-world data.
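The core idea can be illustrated with a toy sketch. This is not the paper's actual algorithm: the data, the `error_overlap` edge weight, and the greedy `select` heuristic below are all hypothetical, chosen only to show how a graph over models (nodes = models, edge weights = shared errors) can guide the selection of a diverse subset before combining by majority vote.

```python
# Hypothetical correctness vectors for four models on ten test cases
# (1 = correct prediction, 0 = error). Models A and B make similar errors;
# C and D err on different cases.
preds = {
    "A": [1, 1, 1, 0, 0, 1, 1, 1, 0, 1],
    "B": [1, 1, 1, 0, 0, 1, 1, 0, 1, 1],  # errors overlap heavily with A
    "C": [0, 1, 0, 1, 1, 1, 1, 1, 1, 0],  # errors disjoint from A's
    "D": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1],
}

def error_overlap(a, b):
    """Edge weight in the model graph: fraction of cases both models get wrong."""
    return sum(1 for x, y in zip(a, b) if x == 0 and y == 0) / len(a)

def select(preds, k):
    """Greedy selection sketch: start with the most accurate model, then
    repeatedly add the model with the smallest summed edge weight (shared
    errors) to the models already chosen."""
    chosen = [max(preds, key=lambda m: sum(preds[m]))]
    while len(chosen) < k:
        rest = [m for m in preds if m not in chosen]
        chosen.append(min(rest, key=lambda m: sum(
            error_overlap(preds[m], preds[c]) for c in chosen)))
    return chosen

def majority_accuracy(models, preds):
    """Accuracy of a simple majority vote over the chosen models."""
    n = len(next(iter(preds.values())))
    hits = sum(1 for i in range(n)
               if sum(preds[m][i] for m in models) > len(models) / 2)
    return hits / n
```

With this data, each model is 70% accurate on its own. The graph-guided choice picks the diverse subset `["A", "C", "D"]`, whose majority vote is perfect, whereas the correlated subset `["A", "B", "D"]` reaches only 80%: shared error characteristics suppress the compensation effect, which is exactly what the selection framework is meant to avoid.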
