Backpropagation uses prior information efficiently

The ability of neural net classifiers to exploit a priori information is investigated. For this purpose, backpropagation classifiers are trained on data drawn from known distributions with varying a priori probabilities, and their performance is evaluated on separate test sets. It is found that backpropagation employs a priori information in a slightly suboptimal fashion, but that this has no serious consequences for the performance of the classifier. Furthermore, the inferior generalization that results when an excessive number of network parameters is used can be ascribed, in part, to this suboptimality.
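The kind of experiment described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: two one-dimensional unit-variance Gaussian classes with means -1 and +1, a single sigmoid unit trained by gradient descent on cross-entropy (the degenerate one-layer case of backpropagation), and the hypothetical helper names `sample`, `bayes_threshold`, `train_sigmoid_unit`, and `compare`. Because the distributions are known, the Bayes-optimal decision threshold for a given prior is available in closed form and can be compared with the threshold the trained unit settles on.

```python
import numpy as np

def sample(n, prior1, rng):
    """Draw n labelled points: class 1 ~ N(+1, 1) with probability prior1,
    class 0 ~ N(-1, 1) otherwise (illustrative distributions, not the paper's)."""
    y = (rng.random(n) < prior1).astype(float)
    x = rng.normal(loc=2.0 * y - 1.0, scale=1.0)
    return x, y

def bayes_threshold(prior1):
    """Bayes-optimal boundary for the two Gaussians above:
    choose class 1 when x exceeds this value."""
    return 0.5 * np.log((1.0 - prior1) / prior1)

def train_sigmoid_unit(x, y, lr=0.5, steps=2000):
    """Full-batch gradient descent on cross-entropy for one sigmoid unit."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        grad = p - y  # derivative of the loss w.r.t. the pre-activation
        w -= lr * np.mean(grad * x)
        b -= lr * np.mean(grad)
    return w, b

def error_rate(x, y, threshold):
    """Fraction of points misclassified by the rule 'class 1 if x > threshold'."""
    return float(np.mean((x > threshold) != (y == 1.0)))

def compare(prior1, n_train=20000, n_test=20000, seed=0):
    """Train on one sample, then score learned and Bayes thresholds on a separate test set."""
    rng = np.random.default_rng(seed)
    x_tr, y_tr = sample(n_train, prior1, rng)
    x_te, y_te = sample(n_test, prior1, rng)
    w, b = train_sigmoid_unit(x_tr, y_tr)
    learned = -b / w  # the unit outputs > 0.5 exactly when x > -b / w (for w > 0)
    optimal = bayes_threshold(prior1)
    return {
        "learned_threshold": float(learned),
        "bayes_threshold": float(optimal),
        "learned_error": error_rate(x_te, y_te, learned),
        "bayes_error": error_rate(x_te, y_te, optimal),
    }

if __name__ == "__main__":
    for prior1 in (0.5, 0.2, 0.05):
        print(prior1, compare(prior1))
```

Running `compare` over a range of priors shows how closely the learned threshold tracks the Bayes-optimal one as the class balance shifts. In this toy setting the true posterior is exactly a sigmoid of a linear function of x, so the sketch cannot reproduce the over-parameterization effect mentioned in the abstract; that would need a larger network and a smaller training set.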
