Parametric Statistical Estimation with Artificial Neural Networks: A Condensed Discussion

Learning in artificial neural networks is a process by which experience, gained from exposure to measurements of empirical phenomena, is converted into knowledge embodied in the network weights. This process can be viewed formally as statistical estimation of the parameters of a parametrized probability model. We exploit this formal viewpoint to give a unified theory of learning in artificial neural networks, encompassing both supervised and unsupervised learning in either feedforward or recurrent networks. We begin by describing various objects appropriate for learning, such as conditional means, variances, or quantiles, and conditional densities. We then show how artificial neural networks can be viewed as parametric statistical models directed toward these objects of interest. In particular, we show how a probability density can be associated with the output of any network, and we use this density to define the network weights that index an information-theoretically optimal approximation to the object of interest. We next study the statistical properties of quasi-maximum likelihood estimators consistent for these optimal weights, including issues of statistical inference about the optimal weights. Finally, we consider computational methods for obtaining the estimators, with special attention to extensions of the method of back-propagation.
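To make the estimation target concrete, the display below states the Kullback-Leibler sense in which the optimal weights are "information-theoretically optimal" and the sample analogue that the quasi-maximum likelihood estimator maximizes. The notation (g, f, w*, ŵ_n) is chosen here for illustration and is not necessarily the paper's own.

```latex
% g(y | x): true conditional density; f(y | x; w): density induced by the
% network output. (Illustrative notation, not necessarily the paper's.)
\begin{align*}
w^{*} &= \arg\min_{w}\; \mathbb{E}\!\left[\log \frac{g(Y \mid X)}{f(Y \mid X; w)}\right]
       = \arg\max_{w}\; \mathbb{E}\!\left[\log f(Y \mid X; w)\right],\\
\hat{w}_{n} &= \arg\max_{w}\; \frac{1}{n}\sum_{t=1}^{n} \log f(Y_{t} \mid X_{t}; w),
\end{align*}
% so the quasi-maximum likelihood estimator \hat{w}_{n} is, under regularity
% conditions, consistent for the KL-optimal weights w^{*}.
```

A minimal sketch of how this plays out computationally follows, assuming the simplest case: a Gaussian density with fixed variance attached to the output of a one-hidden-layer network, so that maximizing the quasi-log-likelihood reduces to squared-error back-propagation. All names (`fit_qmle`, `n_hidden`, and so on) are invented for illustration; this is not the paper's code.

```python
# Hedged sketch: Gaussian quasi-maximum likelihood estimation of the weights
# of a one-hidden-layer tanh network by plain gradient descent (back-prop).
import numpy as np

rng = np.random.default_rng(0)

def forward(w, X):
    """Network output (the conditional-mean model) and hidden activations."""
    W1, b1, W2, b2 = w
    H = np.tanh(X @ W1 + b1)
    return H @ W2 + b2, H

def neg_quasi_loglik(w, X, y, sigma2=1.0):
    """Average negative Gaussian quasi-log-likelihood (up to a constant)."""
    mu, _ = forward(w, X)
    return 0.5 * np.mean((y - mu) ** 2) / sigma2

def fit_qmle(X, y, n_hidden=5, lr=0.05, n_epochs=2000):
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    n = len(y)
    for _ in range(n_epochs):
        # forward pass
        H = np.tanh(X @ W1 + b1)
        mu = H @ W2 + b2
        # back-propagate the gradient of the negative quasi-log-likelihood
        d_mu = (mu - y) / n                  # d(loss)/d(mu)
        dW2 = H.T @ d_mu
        db2 = d_mu.sum(axis=0)
        dH = d_mu @ W2.T * (1.0 - H ** 2)    # tanh'(z) = 1 - tanh(z)^2
        dW1 = X.T @ dH
        db1 = dH.sum(axis=0)
        # gradient-descent update
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (W1, b1, W2, b2)

# Toy usage: recover a smooth conditional mean from noisy observations.
X = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(X) + 0.1 * rng.normal(size=(200, 1))
w_hat = fit_qmle(X, y)
print("in-sample neg. quasi-log-lik:", neg_quasi_loglik(w_hat, X, y))
```

Because the quasi-likelihood here is Gaussian with fixed variance, the gradient coincides with the familiar squared-error back-propagation gradient; targeting a different object of interest (a conditional variance, quantile, or full density) would change only the quasi-likelihood and the `d_mu` line, which is the sense in which the framework extends back-propagation.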
