GRAIL: A Goal-Discovering Robotic Architecture for Intrinsically-Motivated Learning

In this paper, we present the goal-discovering robotic architecture for intrinsically motivated learning (GRAIL), a four-level architecture that is able to autonomously: 1) discover changes in the environment; 2) form representations of the goals corresponding to those changes; 3) select the goal to pursue on the basis of intrinsic motivations (IMs); 4) select suitable computational resources to achieve the selected goal; 5) monitor the achievement of the selected goal; and 6) self-generate a learning signal when the selected goal is successfully achieved. Building on previous research, GRAIL exploits the power of goals and competence-based IMs to autonomously explore the world and learn different skills that allow the robot to modify the environment. To highlight the features of GRAIL, we implement it in a simulated iCub robot and test the system in four different experimental scenarios in which the agent has to perform reaching tasks within a 3-D environment.
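As a rough illustration of how the functions listed above could fit together, the following Python sketch implements a toy goal loop. It is not the GRAIL implementation: the class name `GoalLoop`, the dictionary-based state, the softmax selection rule, and the use of a running average of the competence prediction error as the intrinsic motivation signal are all assumptions made for this example. The selection of computational resources (function 4) is omitted.

```python
import math
import random


class GoalLoop:
    """Toy goal-discovery and goal-selection loop (illustrative only, not GRAIL)."""

    def __init__(self, temperature=0.1, lr=0.2, seed=0):
        self.rng = random.Random(seed)
        self.temperature = temperature  # assumed softmax temperature
        self.lr = lr                    # assumed learning rate
        self.competence = {}  # goal -> predicted probability of achieving it
        self.motivation = {}  # goal -> running average of |prediction error|

    def discover(self, before, after):
        """Functions 1-2: treat each changed state variable as a candidate goal."""
        new_goals = []
        for key, value in after.items():
            if before.get(key) != value:
                goal = (key, value)
                if goal not in self.competence:
                    self.competence[goal] = 0.0
                    self.motivation[goal] = 1.0  # optimistic initial motivation
                    new_goals.append(goal)
        return new_goals

    def select_goal(self):
        """Function 3: sample a goal with a softmax over motivation values."""
        goals = list(self.motivation)
        weights = [math.exp(self.motivation[g] / self.temperature) for g in goals]
        return self.rng.choices(goals, weights=weights, k=1)[0]

    def update(self, goal, achieved):
        """Functions 5-6: monitor the outcome and self-generate a learning signal."""
        signal = 1.0 if achieved else 0.0
        error = signal - self.competence[goal]
        self.competence[goal] += self.lr * error
        # Motivation tracks how badly the outcome was predicted, so it decays
        # once the goal is achieved reliably and the skill is mastered.
        self.motivation[goal] += self.lr * (abs(error) - self.motivation[goal])
        return signal


loop = GoalLoop()
discovered = loop.discover({"light": "off"}, {"light": "on"})
goal = loop.select_goal()
loop.update(goal, achieved=True)
```

Under these assumptions, a goal that is achieved repeatedly sees its competence estimate approach 1 and its motivation approach 0, so the softmax gradually shifts selection toward goals that are not yet mastered.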
