A Genetic Algorithm Approach for Creating Neural-Network Ensembles

A neural-network ensemble is a successful technique in which the outputs of a set of separately trained neural networks are combined to form one unified prediction. An effective ensemble should consist of networks that are not only highly accurate, but that also make their errors on different parts of the input space; most existing techniques, however, address the problem of creating such a set only indirectly. We present an algorithm called Addemup that uses genetic algorithms to explicitly search for a highly diverse set of accurate trained networks. Addemup first creates an initial population and then uses genetic operators to continually create new networks, keeping the set of networks that are highly accurate while disagreeing with each other as much as possible. Experiments on four real-world domains show that Addemup generates a set of trained networks that is more accurate than several existing ensemble approaches. Experiments also show that Addemup can effectively incorporate prior knowledge, if available, to improve the quality of its ensemble.
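
The abstract describes the search loop concretely enough to sketch in code. Below is a minimal illustration of the idea, not the authors' implementation: it assumes a fitness of the form accuracy + lambda * diversity (the paper combines accuracy with a diversity term in a similar spirit), uses scikit-learn MLPs as the candidate networks, and substitutes a simple hidden-layer-size mutation for the paper's knowledge-based genetic operators. The names and parameter values here (LAMBDA, POP_SIZE, the mutation step) are illustrative assumptions.

```python
# Hedged sketch of a genetic search for an accurate-but-diverse ensemble.
# Assumptions (not from the paper): fitness = accuracy + LAMBDA * diversity,
# diversity = disagreement with the ensemble's majority vote, and mutation
# perturbs a network's hidden-layer size and reinitializes its weights.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

LAMBDA = 0.5                    # accuracy/diversity trade-off (assumed value)
POP_SIZE, GENERATIONS = 8, 5    # kept small so the sketch runs quickly

def train_network(hidden, seed):
    """Train one candidate; its 'genome' is (hidden units, weight seed)."""
    return MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=1000,
                         random_state=seed).fit(X_tr, y_tr)

def fitness(nets):
    """Validation accuracy plus disagreement with the majority vote."""
    preds = np.array([n.predict(X_val) for n in nets])
    majority = (preds.mean(axis=0) >= 0.5).astype(int)
    scores = []
    for p in preds:
        acc = (p == y_val).mean()
        diversity = (p != majority).mean()   # fraction of disagreements
        scores.append(acc + LAMBDA * diversity)
    return np.array(scores)

# Initial population: networks with varied topologies and initial weights.
genomes = [(int(rng.integers(2, 16)), int(rng.integers(1000)))
           for _ in range(POP_SIZE)]
nets = [train_network(h, s) for h, s in genomes]

for gen in range(GENERATIONS):
    scores = fitness(nets)
    # Mutate a copy of the fittest genome: nudge its hidden-layer size and
    # retrain (a stand-in for the paper's knowledge-based operators).
    best = int(scores.argmax())
    h, _ = genomes[best]
    child_genome = (max(2, h + int(rng.integers(-3, 4))), int(rng.integers(1000)))
    child = train_network(*child_genome)
    # Replace the least-fit member only if the swap helps at that slot.
    worst = int(scores.argmin())
    trial = nets.copy()
    trial[worst] = child
    if fitness(trial)[worst] > scores[worst]:
        nets[worst] = child
        genomes[worst] = child_genome

# The surviving population is the ensemble; combine by majority vote.
ensemble_pred = (np.mean([n.predict(X_val) for n in nets], axis=0) >= 0.5).astype(int)
print("ensemble validation accuracy:", (ensemble_pred == y_val).mean())
```

The design point the sketch preserves is that fitness is evaluated against the current population, so a candidate is rewarded not just for being accurate but for erring where its peers do not, which is exactly the explicit accuracy-plus-diversity search the abstract claims.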
