Evolutionary Multiobjective Ensemble Learning Based on Bayesian Feature Selection

This paper proposes combining an evolutionary multiobjective algorithm with Bayesian Automatic Relevance Determination (ARD) to design and train neural network (NN) ensembles automatically. The algorithm determines almost all of the ensemble's parameters automatically. It trains the individual NNs on different feature subsets, selected by Bayesian ARD, to maintain accuracy while promoting diversity among the ensemble members. The multiobjective fitness evaluation favors networks with a lower error rate and fewer features. The proposed algorithm is applied to several real-world classification problems, and in all cases it performs better than the other ensemble construction algorithms it is compared with.
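
The two-objective fitness described above (lower error rate, fewer features) can be illustrated with a plain Pareto-dominance filter. This is a minimal sketch, not the paper's algorithm: the abstract does not state how candidates are ranked, so the dominance rule, the names `dominates` and `pareto_front`, and the example population are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (error_rate, n_features), both to be minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates, in input order."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up networks: (validation error rate, number of selected features).
population = [(0.10, 8), (0.12, 5), (0.10, 6), (0.20, 5), (0.15, 3)]
front = pareto_front(population)
print(front)
```

In this toy population, `(0.10, 8)` is dominated by `(0.10, 6)` (same error, fewer features) and `(0.20, 5)` by `(0.12, 5)` (same feature count, lower error), so only the remaining three trade-offs survive as candidate ensemble members.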
