A Probabilistic Computational Framework for Neural Network Models

Abstract: Information retrieval in a 'connectionist' or neural network is viewed as computing the most probable value of the information to be retrieved with respect to a probability density function, P. With a minimal number of assumptions, the 'energy' function that a neural network minimizes during information retrieval is shown to uniquely specify P. Inspection of the form of P indicates the class of probabilistic environments that can be learned. Learning algorithms can be analyzed and designed by using maximum likelihood estimation techniques to estimate the parameters of P. The large class of nonlinear auto-associative networks analyzed by Cohen and Grossberg (1983), nonlinear associative multi-layer back-propagation networks (Rumelhart, Hinton, & Williams, 1986), and certain classes of nonlinear multi-stage networks are analyzed within the proposed computational framework.

Keywords: Artificial intelligence, Connectionism, Non-linear associator.
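The correspondence between energy minimization and most-probable-value retrieval can be illustrated with a small sketch. The abstract does not state the form of P, so the code below assumes a Gibbs form, P(x) = exp(-E(x)) / Z, together with a quadratic energy over three binary units; the weights `W`, biases `b`, and all function names are hypothetical and chosen only for illustration. Under that assumption, the state with the lowest energy is also the state with the highest probability.

```python
import itertools
import math

# Hypothetical symmetric weights and biases for a 3-unit network
# (illustrative values only, not taken from the paper).
W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]
b = [0.1, -0.2, 0.3]


def energy(x):
    """Quadratic energy E(x) = -1/2 * x^T W x - b^T x (an assumed form)."""
    n = len(x)
    quad = sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return -0.5 * quad - lin


def gibbs_distribution(states):
    """Assumed link between energy and probability: P(x) = exp(-E(x)) / Z."""
    weights = [math.exp(-energy(x)) for x in states]
    z = sum(weights)  # normalizing constant Z
    return [w / z for w in weights]


# Enumerate every state of three units taking values in {-1, +1}.
states = list(itertools.product([-1, 1], repeat=3))
probs = gibbs_distribution(states)

# Retrieval as energy minimization versus retrieval as the most probable state.
min_energy_state = min(states, key=energy)
most_probable_state = states[max(range(len(states)), key=probs.__getitem__)]

print(min_energy_state, most_probable_state)
```

Because exp(-E) is strictly decreasing in E, the two selections agree for any energy function, not just this one. Exhaustive enumeration is used here only because the state space is tiny.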
