Paper Information - Self-Organizing Sensorimotor Maps Plus Internal Motivations Yield Animal-Like Behavior

This article investigates how a motivational module can drive an animat to learn a sensorimotor cognitive map and use it to generate flexible goal-directed behavior. Inspired by the rat’s hippocampus and neighboring areas, the time growing neural gas (TGNG) algorithm is used, which iteratively builds such a map by means of temporal Hebbian learning. The algorithm is combined with a motivation module, which activates goals, priorities, and consequent activity gradients in the developing cognitive map for the self-motivated control of behavior. The resulting motivated TGNG thus combines a neural cognitive map learning process with top-down, self-motivated, anticipatory behavior control mechanisms. While the algorithms involved are kept rather simple, motivated TGNG displays several emergent behavioral patterns, self-sustainment, and reliable latent learning. We conclude that motivated TGNG constitutes a solid basis for future studies on self-motivated cognitive map learning, on the design of further enhanced systems with additional cognitive modules, and on the realization of highly adaptive, interactive, goal-directed, cognitive systems.

Martin V. Butz | Elshad Shirinov | Kevin L. Reif
