Behavioral Diversity and a Probabilistically Optimal GP Ensemble

We propose N-version Genetic Programming (NVGP), an ensemble method intended to improve the accuracy and reduce the performance fluctuation of programs produced by genetic programming. Diversity among members is essential to forming a successful ensemble. NVGP quantifies the behavioral diversity of ensemble members and defines an ensemble as NVGP optimal when faults occur independently among its members. When applied to a DNA segment classification problem, NVGP optimal ensembles showed a significant improvement in accuracy.
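To illustrate why independent fault occurrences matter, the following is a minimal sketch of the error rate an ensemble would be expected to have if its members failed independently. It rests on assumptions not stated in the abstract: the ensemble decides by simple majority vote, the number of members is odd, and every member has the same individual error rate `p`. The function name `majority_vote_error` is hypothetical and is not taken from the paper.

```python
from math import comb


def majority_vote_error(n: int, p: float) -> float:
    """Probability that a majority of n members are wrong on the same input.

    Assumptions (illustrative only): members fail independently, each with
    the same error rate p, and the ensemble output is a simple majority vote.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number for an unambiguous majority")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")

    # The ensemble errs when at least (n // 2 + 1) members err together.
    threshold = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, majority_vote_error(n, 0.1))
```

Under these assumptions, three members with an individual error rate of 0.1 give an expected ensemble error rate of 0.028. An observed ensemble error rate close to this value would be consistent with independent faults, while a noticeably higher observed rate would suggest that members tend to fail on the same inputs.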
