Intrinsic Motivation Systems for Autonomous Mental Development

Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This paper studies that question and presents a number of computational systems that try to capture this drive towards novel or curiosity-arousing situations. After discussing related research from developmental psychology, neuroscience, developmental robotics, and active learning, the paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system that pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations that are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot's activities increases autonomously, and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments illustrate the stage-like organization that emerges with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations that are easy to learn, then progressively shifts its attention to situations of increasing difficulty, while avoiding situations in which nothing can be learned. Finally, these results are discussed in relation to more complex forms of behavioral organization and to data from developmental psychology.
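The idea of maximizing learning progress can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `learning_progress`, the window size, and the three error histories are hypothetical, chosen only to show why a progress-based reward favors situations that are neither too predictable nor too unpredictable. Here learning progress is assumed to be the drop in mean prediction error between two consecutive windows.

```python
from statistics import mean


def learning_progress(errors, window=5):
    """Drop in mean prediction error between the previous and the latest window.

    Assumption: progress is measured as (older mean error) - (recent mean error).
    """
    if len(errors) < 2 * window:
        return 0.0  # not enough history to estimate progress
    older = mean(errors[-2 * window:-window])
    recent = mean(errors[-window:])
    return older - recent


# Hypothetical prediction-error histories for three kinds of situations.
error_histories = {
    # Already mastered: error is low and flat, so nothing is left to learn.
    "too_predictable": [0.01] * 10,
    # Learnable: error decreases steadily as the predictor improves.
    "learnable": [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
    # Noise: error stays high and does not decrease.
    "too_unpredictable": [0.9, 1.1, 1.0, 0.9, 1.1, 1.1, 0.9, 1.0, 1.1, 0.9],
}

progress = {name: learning_progress(errs) for name, errs in error_histories.items()}

# The drive selects the situation with the highest learning progress.
chosen = max(progress, key=progress.get)
print(chosen)
```

Under these assumptions, both the mastered and the noisy situation yield zero progress, so only the learnable one is rewarding. A reward based on raw prediction error would instead favor the noisy situation.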
