Robust combining methods in committee neural networks

Combining a set of suitable experts can improve the generalization performance of the group compared to any single expert alone. The classical problem in this area is how to combine the ensemble members (the individuals). Various methods for combining the outputs of the experts in a committee machine (ensemble) have been reported in the literature. The most common measure of prediction error is the mean square error (MSE), which is heavily influenced by the outliers found in many real-world data sets, such as geoscience data. In this paper we introduce Robust Committee Neural Networks (RCNNs). Our approach uses the Huber and bisquare functions, which are less influenced by outliers, to measure the error between measured and predicted values. We then use a Genetic Algorithm (GA) to combine the individual networks, with the Huber and bisquare losses serving as fitness functions. The results show that the Root Mean Square Error (RMSE) and R-squared values obtained with these two fitness functions improve on those obtained with MSE, and that the proposed combiner outperforms five existing training algorithms.
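To make the idea concrete, the sketch below shows one plausible reading of the approach: several trained experts are combined by a weight vector, and a simple genetic algorithm searches for weights that minimize a robust loss (Huber or bisquare) rather than MSE. This is a minimal illustration, not the paper's exact implementation; all function names, GA operators, and parameter values (population size, mutation scale, the standard tuning constants 1.345 and 4.685) are assumptions.

```python
# Illustrative sketch of GA-combined experts scored with robust losses.
# Assumed setup, not the paper's exact method or hyperparameters.
import numpy as np

def huber_loss(residuals, delta=1.345):
    """Huber loss: quadratic near zero, linear in the tails."""
    abs_r = np.abs(residuals)
    quad = 0.5 * residuals**2
    lin = delta * (abs_r - 0.5 * delta)
    return np.where(abs_r <= delta, quad, lin).mean()

def bisquare_loss(residuals, c=4.685):
    """Tukey's bisquare (biweight) loss: bounded, so outliers saturate."""
    r = residuals / c
    inside = (c**2 / 6.0) * (1 - (1 - r**2) ** 3)
    return np.where(np.abs(residuals) <= c, inside, c**2 / 6.0).mean()

def ga_combine(expert_preds, y, loss=huber_loss, pop_size=50,
               generations=200, mutation_scale=0.1, seed=None):
    """Evolve convex combination weights for the experts (lower loss = fitter)."""
    rng = np.random.default_rng(seed)
    n_experts = expert_preds.shape[0]
    pop = rng.random((pop_size, n_experts))

    def fitness(w):
        w = np.abs(w)
        w = w / w.sum()                       # normalize to a convex combination
        return loss(y - w @ expert_preds)

    for _ in range(generations):
        scores = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]   # truncation selection
        # Crossover: average random parent pairs, then apply Gaussian mutation.
        idx = rng.integers(0, len(parents), (pop_size - len(parents), 2))
        children = parents[idx].mean(axis=1)
        children += rng.normal(0.0, mutation_scale, children.shape)
        pop = np.vstack([parents, children])

    best = np.abs(pop[np.argmin([fitness(w) for w in pop])])
    return best / best.sum()

# Usage: three noisy "experts" predicting a signal whose targets contain outliers.
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 6, 200))
experts = np.stack([y + rng.normal(0, s, y.shape) for s in (0.05, 0.1, 0.3)])
y_noisy = y.copy()
y_noisy[rng.integers(0, 200, 5)] += 5.0       # inject a few large outliers
weights = ga_combine(experts, y_noisy, loss=bisquare_loss, seed=1)
print("combination weights:", np.round(weights, 3))
```

Because the bisquare loss is bounded, the injected outliers contribute at most a constant to each candidate's fitness, so the evolved weights reflect the experts' accuracy on the clean bulk of the data; under an MSE fitness the same outliers would dominate the search.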
