Analytical Interpretation of Feed-Forward Net Outputs after Training

The quadratic error criterion whose minimization gives rise to the back-propagation algorithm is studied using functional-analysis techniques. With these, we easily recover the well-known statistical result that the sought global minimum is the function assigning, to each input pattern, the expected value of its corresponding output patterns. Applying this result to classification tasks shows that only certain output class representations can be used to obtain the optimal Bayesian decision rule. Finally, our method permits the study of other error criteria, showing, for instance, that absolute-value errors lead to medians instead of mean values.
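The two claims above can be checked numerically. The sketch below (illustrative only, not taken from the paper) treats the network output for one fixed input pattern as a free constant c and minimizes the empirical quadratic and absolute-value errors over a grid: the quadratic minimizer lands on the sample mean, the absolute-value minimizer on the sample median. The skewed exponential target makes the two visibly different.

```python
import numpy as np

# Skewed target samples for a single input pattern, so mean != median.
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=5001)

def quad_err(c):
    return np.mean((y - c) ** 2)   # quadratic (back-propagation) criterion

def abs_err(c):
    return np.mean(np.abs(y - c))  # absolute-value criterion

# Minimize each criterion over a fine grid of candidate constants.
grid = np.linspace(0.0, 3.0, 3001)
c_quad = grid[np.argmin([quad_err(c) for c in grid])]
c_abs = grid[np.argmin([abs_err(c) for c in grid])]

print(c_quad, np.mean(y))   # quadratic minimizer tracks the sample mean
print(c_abs, np.median(y))  # absolute-value minimizer tracks the sample median
```

For an exponential distribution with unit scale the mean is about 1.0 while the median is about 0.69, so the two minimizers separate clearly, matching the statement that the error criterion determines which statistic of the output distribution the trained net computes.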
