Neural network ensembles: evaluation of aggregation algorithms

Ensembles of artificial neural networks show generalization capabilities that outperform those of single networks. However, for aggregation to be effective, the individual networks must be as accurate and as diverse as possible. An important problem is then how to tune the aggregate members in order to reach an optimal compromise between these two conflicting requirements. We present an extensive evaluation of several algorithms for ensemble construction, including new proposals, and compare them with standard methods from the literature. We also discuss a potential shortcoming of sequential aggregation algorithms: their heuristics occasionally select particularly bad ensemble members, an infrequent but damaging event. We introduce modified algorithms that cope with this problem by allowing individual weighting of the aggregate members. Our algorithms and their weighted modifications compare favorably with other methods from the literature, producing an appreciable improvement in performance on most of the standard statistical databases used as benchmarks.
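To picture what sequential aggregation with individually weighted members looks like, the sketch below is one minimal illustration, not the paper's exact algorithm: from a pool of trained candidate networks it greedily adds the member that most reduces validation error of the ensemble, refitting per-member weights by least squares at every step. The function name, the least-squares weighting scheme, and the toy data are all assumptions made for this example.

```python
# Hypothetical sketch of greedy sequential aggregation with per-member
# weights (assumed least-squares weighting; not the paper's algorithm).
import numpy as np

def greedy_weighted_ensemble(preds, y_val, max_members=10):
    """preds: (n_candidates, n_val) predictions of candidate networks on a
    validation set; y_val: (n_val,) targets.
    Returns indices of the selected members and their weights."""
    selected, weights, best_err = [], None, np.inf
    for _ in range(max_members):
        best_j, best_w, best_cand_err = None, None, best_err
        for j in range(preds.shape[0]):
            if j in selected:
                continue
            P = preds[selected + [j]].T                    # (n_val, k) design matrix
            w, *_ = np.linalg.lstsq(P, y_val, rcond=None)  # per-member weights
            err = np.mean((P @ w - y_val) ** 2)            # validation MSE
            if err < best_cand_err:
                best_j, best_w, best_cand_err = j, w, err
        if best_j is None:          # no remaining candidate improves the ensemble
            break
        selected.append(best_j)
        weights, best_err = best_w, best_cand_err
    return selected, weights

# Toy usage: 20 noisy candidate predictors of a sine target.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x)
preds = y + rng.normal(0.0, 0.3, size=(20, x.size))
members, w = greedy_weighted_ensemble(preds, y)
print(members, w)
```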
