Balancing Bias and Variance: Network Topology and Pattern Set Reduction Techniques

It has been estimated that some 70% of applications of neural networks use some variant of the multi-layer feed-forward network trained using back-propagation. These neural networks are non-parametric estimators, and their limitations can be explained by a well-understood problem in non-parametric statistics, namely the “bias and variance” dilemma. The dilemma is that, to obtain a good approximation of an input-output relationship using some form of estimator, either constraints must be placed on the structure of the estimator, which introduces bias, or a very large number of examples of the relationship must be used to construct the estimator. Thus, we have a trade-off between generalisation ability and training time.
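The dilemma is commonly stated through the standard decomposition of an estimator's expected squared error, sketched below. The notation is an assumption made here for illustration and does not come from the abstract: f(x; D) denotes the estimator built from a training set D, E[y | x] the regression function being approximated, and E_D the expectation over training sets of a fixed size.

```latex
\mathbb{E}_D\!\left[\bigl(f(x;D) - \mathbb{E}[y \mid x]\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}_D[f(x;D)] - \mathbb{E}[y \mid x]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\bigl(f(x;D) - \mathbb{E}_D[f(x;D)]\bigr)^2\right]}_{\text{variance}}
```

Constraining the structure of the estimator tends to shrink the variance term at the cost of a larger bias term, while a very large number of examples lets a weakly constrained estimator keep both terms small.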
