Optimal linear combinations of neural networks: an overview

Neural-network-based modeling often involves training multiple networks with different architectures and/or training parameters in order to achieve acceptable model accuracy. Typically, one of the trained networks is chosen as the best, while the rest are discarded. Hashem and Schmeiser (1992) proposed using an optimal linear combination of several trained neural networks instead of a single best network. In this paper, we discuss and extend the idea of optimal linear combinations of neural networks. An optimal linear combination is constructed by forming a weighted sum of the corresponding outputs of the component networks, with the combination weights selected to minimize the mean squared error with respect to the distribution of random inputs. Combining the trained networks may help integrate the knowledge acquired by the component networks and thus improve model accuracy. We investigate issues in estimating the optimal combination weights and the role of optimal linear combinations in improving model accuracy for both well-trained and poorly trained component networks. Experimental results based on simulated data are included. In our examples, the model accuracy obtained with estimated optimal linear combinations is better than that of the best trained network and that of the simple average of the component networks' outputs.
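In the unconstrained case the combination weights have a familiar least-squares form: with y the vector of component-network outputs and t the target, minimizing E[(t - w'y)^2] gives w* = E[yy']^{-1} E[yt], which in practice is estimated by regressing the targets on the networks' outputs over a combination data set. The sketch below illustrates the idea; it is not the authors' code, and the variable names, the synthetic stand-in "networks," and the intercept-free least-squares form are assumptions made here for illustration.

```python
import numpy as np

# Minimal sketch (not the paper's code) of an MSE-optimal linear
# combination (OLC) of trained networks. The three noisy "networks"
# below are synthetic stand-ins for real trained models.
rng = np.random.default_rng(0)
n, k = 200, 3

x = rng.uniform(-np.pi, np.pi, n)      # random inputs
t = np.sin(x)                          # targets
# Column j holds "network" j's outputs on a held-out combination set.
Y = np.column_stack([np.sin(x) + 0.1 * rng.standard_normal(n)
                     for _ in range(k)])

# Unconstrained MSE-optimal weights: argmin_w ||Y w - t||^2.
# (Appending a column of ones to Y would add a constant term.)
w, *_ = np.linalg.lstsq(Y, t, rcond=None)

mse = lambda p: float(np.mean((p - t) ** 2))
print("weights:", w)
print("best single net MSE :", min(mse(Y[:, j]) for j in range(k)))
print("simple average MSE  :", mse(Y.mean(axis=1)))
print("optimal combo MSE   :", mse(Y @ w))
```

Constrained variants considered in this line of work, such as weights summing to one or an added constant term, change only the regression being solved.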

[1] Anders Krogh, et al. Introduction to the theory of neural computation, 1994, The Advanced Book Program.

[2] Johannes R. Sveinsson, et al. Parallel consensual neural networks, 1993, IEEE International Conference on Neural Networks.

[3] R. Clemen. Combining forecasts: A review and annotated bibliography, 1989.

[4] Bruce W. Schmeiser, et al. Improving model accuracy using optimal linear combinations of trained neural networks, 1995, IEEE Trans. Neural Networks.

[5] Sherif Hashem, Bruce Schmeiser. Approximating a Function and its Derivatives Using MSE-Optimal Linear Combinations of Trained Feedforward Neural Networks, 1993.

[6] Leon N. Cooper. Hybrid neural network architectures: equilibrium systems that pay attention, 1992.

[7] V. Barnett, et al. Applied Linear Statistical Models, 1975.

[8] Lars Kai Hansen, et al. Neural Network Ensembles, 1990, IEEE Trans. Pattern Anal. Mach. Intell.

[9] M. Perrone. Improving regression estimation: Averaging methods for variance reduction with extensions to general convex measure optimization, 1993.

[10] Ganesh Mani. Lowering Variance of Decisions by Using Artificial Neural Network Portfolios, 1991, Neural Computation.

[11] C. Granger. Invited review: Combining forecasts, twenty years later, 1989.

[12] Michael H. Kutner. Applied Linear Statistical Models, 1974.

[13] C. Granger, et al. Improved methods of combining forecasts, 1984.

[14] William G. Baxt, et al. Improving the Accuracy of an Artificial Neural Network Using Multiple Differently Trained Networks, 1992, Neural Computation.

[15] Ethem Alpaydin, et al. Multiple networks for function learning, 1993, IEEE International Conference on Neural Networks.

[16] L. Cooper, et al. When Networks Disagree: Ensemble Methods for Hybrid Neural Networks, 1992.