Simulation Experiments with Goal-Seeking Adaptive Elements.

Abstract: This report describes results from computer simulation experiments designed to systematically develop and evaluate an approach to learning by networks of neuronlike adaptive elements. The characterizing feature of this approach is that the network components, or adaptive elements, are self-interested, goal-seeking agents that implement robust algorithms for furthering their individual interests. These algorithms combine stochastic search and associative learning methods. Results show how these adaptive elements can cooperate as components of layered networks that adaptively create new features by combining existing features. Other simulation experiments systematically examine the performance of individual adaptive elements in tasks that resemble those faced by elements embedded within networks. The performances of a variety of algorithms are compared, and the results demonstrate the shortcomings of certain algorithms and the effectiveness of others. Further experiments examine the effects of delayed reinforcement on learning. We justify the adaptive heuristic critic algorithm as a means of reducing the severity of the temporal credit-assignment problem in tasks with delayed reinforcement, and we illustrate its effectiveness in a task that requires the system to learn to control an unstable dynamical system under the influence of low-quality evaluative feedback.
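To make the phrase "stochastic search and associative learning" concrete, the following is a minimal sketch of a single adaptive element, not the report's own algorithm. It assumes a logistic action probability and a simple reward-weighted update; the toy task, the reward probabilities, and all names (`PATTERNS`, `train`, `action_probability`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy task (an assumption, not from the report): two context
# patterns, each with one "correct" binary action that is rewarded 90% of
# the time; the other action is rewarded 10% of the time.
PATTERNS = [(1.0, 0.0), (0.0, 1.0)]
CORRECT_ACTION = [1, 0]
P_REWARD_IF_CORRECT = 0.9
P_REWARD_IF_WRONG = 0.1


def action_probability(weights, x):
    """Probability that the element emits action 1 for input pattern x."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-s))


def train(trials=4000, step_size=0.5, seed=0):
    """Train one stochastic element from noisy evaluative feedback only."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(trials):
        k = rng.randrange(len(PATTERNS))
        x = PATTERNS[k]
        p = action_probability(weights, x)
        # Stochastic search: the action is sampled, not chosen greedily.
        y = 1 if rng.random() < p else 0
        # Evaluative feedback: +1 (reward) or -1 (penalty), itself noisy.
        correct = y == CORRECT_ACTION[k]
        p_reward = P_REWARD_IF_CORRECT if correct else P_REWARD_IF_WRONG
        r = 1.0 if rng.random() < p_reward else -1.0
        # Associative learning: only weights on active inputs change, so the
        # element learns a separate action preference for each pattern.
        for i, xi in enumerate(x):
            weights[i] += step_size * r * (y - p) * xi
    return weights


if __name__ == "__main__":
    w = train()
    for x in PATTERNS:
        print(x, round(action_probability(w, x), 3))
```

The element is never told which action is correct. It only receives a noisy reward or penalty signal, so it has to try actions at random and associate the outcomes with the input pattern that was present. The update rule here is one common choice for this kind of element and should be read as a stand-in for the family of algorithms the report compares.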

[21]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .

[22]  Cyrus Derman,et al.  Finite State Markovian Decision Processes , 1970 .

[23]  Thomas M. Cover,et al.  The two-armed-bandit problem with time-invariant finite memory , 1970, IEEE Trans. Inf. Theory.

[24]  A. S. Harding Markovian decision processes , 1970 .

[25]  N. Mackintosh,et al.  Mechanisms of animal discrimination learning , 1971 .

[26]  A. G. Ivakhnenko,et al.  Polynomial Theory of Complex Systems , 1971, IEEE Trans. Syst. Man Cybern..

[27]  A. H. Klopf,et al.  Brain Function and Adaptive Systems: A Heterostatic Theory , 1972 .

[28]  Bernard Widrow,et al.  Punish/Reward: Learning with a Critic in Adaptive Threshold Systems , 1973, IEEE Trans. Syst. Man Cybern..

[29]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[30]  M. L. Tsetlin,et al.  Automaton theory and modeling of biological systems , 1973 .

[31]  Kumpati S. Narendra,et al.  Learning Automata - A Survey , 1974, IEEE Trans. Syst. Man Cybern..

[32]  Michael D. Alder,et al.  A Convergence Theorem for Hierarchies of Model Neurones , 1975, SIAM J. Comput..

[33]  Ian H. Witten,et al.  An Adaptive Optimal Controller for Discrete-Time Markov Environments , 1977, Inf. Control..

[34]  G. Edelman,et al.  The Mindful Brain: Cortical Organization and the Group-Selective Theory of Higher Brain Function , 1978 .

[35]  Dana H. Ballard,et al.  Parameter Networks: Towards a Theory of Low-Level Vision , 1981, IJCAI.

[36]  A G Barto,et al.  Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.

[37]  Avron Barr,et al.  The Handbook of Artificial Intelligence, Volume 1 , 1982 .

[38]  Richard S. Sutton,et al.  Goal Seeking Components for Adaptive Intelligence: An Initial Assessment. , 1981 .

[39]  Barr and Feigenbaum Edward A. Avron,et al.  The Handbook of Artificial Intelligence , 1981 .

[40]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. , 1982, Psychological review.

[41]  J J Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[42]  R. Sutton,et al.  Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element , 1982, Behavioural Brain Research.

[43]  John S. Edwards,et al.  The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .

[44]  Richard S. Sutton,et al.  Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.