An empirical study on ensemble selection for class-imbalance data sets

GASEN (Genetic Algorithm based Selective Ensemble) has proven to be an effective way to select a subset of trained neural networks and combine them into an ensemble classifier or regressor with better generalization ability than using all of them. However, the performance of GASEN on class-imbalance data sets has received little attention, even though class-imbalance learning itself is an increasingly important issue. In this paper, an improved version of GASEN is proposed for this setting, incorporating techniques from the class-imbalance learning literature.
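Since the abstract only names GASEN at a high level, the following is a minimal sketch of the selective-ensemble idea it refers to: a simple genetic algorithm evolves one weight per trained base learner using held-out validation predictions, and only learners whose evolved weight exceeds a threshold are kept for the final ensemble. All function names, parameter values, and the balanced-accuracy fitness (a nod to class-imbalance learning) are illustrative assumptions, not the paper's implementation.

```python
# Sketch of a GASEN-style selective ensemble with a toy genetic algorithm.
# Assumes `preds` holds each base learner's binary {0,1} predictions on a
# validation set, shape (n_learners, n_samples), and `y_val` the true labels.
import numpy as np

rng = np.random.default_rng(0)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; less biased by class imbalance than plain accuracy."""
    recalls = []
    for c in np.unique(y_true):
        mask = y_true == c
        recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))

def ensemble_predict(preds, weights):
    """Weighted vote over binary predictions: score >= 0.5 predicts class 1."""
    score = weights @ preds / weights.sum()
    return (score >= 0.5).astype(int)

def gasen_select(preds, y_val, pop_size=50, generations=100, mutation_std=0.05):
    n_learners = preds.shape[0]
    threshold = 1.0 / n_learners          # keep learners weighted above average
    pop = rng.random((pop_size, n_learners))

    def fitness(w):
        return balanced_accuracy(y_val, ensemble_predict(preds, w))

    for _ in range(generations):
        scores = np.array([fitness(w) for w in pop])
        # Truncation selection: the better half become parents.
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        # Uniform crossover plus Gaussian mutation refills the population.
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(n_learners) < 0.5
            child = np.where(mask, a, b) + rng.normal(0.0, mutation_std, n_learners)
            children.append(np.clip(child, 0.0, 1.0))
        pop = np.vstack([parents, children])

    best = max(pop, key=fitness)
    # Indices of learners whose normalized weight exceeds the threshold.
    return np.flatnonzero(best / best.sum() > threshold)
```

At test time the selected learners would be combined by plain (or weighted) voting. For the class-imbalance variant the abstract alludes to, the fitness could be replaced by F-measure or AUC, possibly computed after resampling the validation data; balanced accuracy is used here only as a placeholder.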
