Internal Models and Anticipations in Adaptive Learning Systems

The explicit investigation of anticipations in relation to adaptive behavior is a recent approach. This chapter first provides the psychological background that motivates and inspires the study of anticipations in the adaptive behavior field. Next, a basic framework for the study of anticipations in adaptive behavior is suggested, and different anticipatory mechanisms are identified and characterized. Initial fundamental distinctions are drawn between implicit anticipatory behavior, payoff anticipatory behavior, sensory anticipatory behavior, and state anticipatory behavior. A case study provides further insight into these distinctions. Many directions for future research are suggested.
