Modular active curiosity-driven discovery of tool use

This article studies algorithms that a learner can use to explore high-dimensional, structured sensorimotor spaces, such as those involved in tool use discovery. In particular, we consider goal babbling architectures, which were designed to explore and learn solutions to fields of sensorimotor problems, i.e., to acquire inverse models that map a space of parameterized sensorimotor problems/effects to a corresponding space of parameterized motor primitives. So far, however, these architectures have not been used in high-dimensional spaces of effects. Here, we show the limits of existing goal babbling architectures for efficient exploration in such spaces and introduce a novel exploration architecture called Model Babbling (MB), which efficiently exploits a modular representation of the space of parameterized problems/effects. We also study an active version of Model Babbling (the MACOB architecture). These architectures are compared in a simulated experimental setup in which an arm can discover and learn how to move objects using two tools with different properties, a setup that embeds structured, high-dimensional, continuous motor and sensory spaces.
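
To make the idea of exploring a modular effect space concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy environment with two motor parameters and a four-dimensional effect split into two modules ("hand" and "object"), picks a module uniformly at random at each iteration, samples a random goal in that module's subspace, and uses a nearest-neighbor lookup plus Gaussian noise as a stand-in for the inverse model. All names (`environment`, `MODULES`, `nearest_motor`, `model_babbling`) and parameter values are illustrative assumptions.

```python
import random

def environment(m):
    """Hypothetical toy world: 2 motor parameters -> 4-dimensional effect."""
    hand = (m[0], m[1])
    # Assumption: the object only moves once the hand reaches x > 0.5.
    obj = (m[0] - 0.5, m[1]) if m[0] > 0.5 else (0.0, 0.0)
    return hand + obj

# Modular representation: each module covers a subset of effect dimensions.
MODULES = {"hand": (0, 1), "object": (2, 3)}

def nearest_motor(database, dims, goal):
    """Stand-in inverse model: motor command whose effect, restricted to
    the module's dimensions, is closest to the goal."""
    def dist(entry):
        _, s = entry
        return sum((s[d] - g) ** 2 for d, g in zip(dims, goal))
    return min(database, key=dist)[0]

def model_babbling(n_iterations=500, n_bootstrap=10, sigma=0.05, seed=0):
    rng = random.Random(seed)
    database = []  # list of (motor command, observed effect) pairs
    for i in range(n_iterations):
        if i < n_bootstrap:
            # Random motor babbling to bootstrap the database.
            m = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        else:
            # Pick a module at random, then a random goal in its subspace.
            dims = MODULES[rng.choice(sorted(MODULES))]
            goal = tuple(rng.uniform(-1, 1) for _ in dims)
            best = nearest_motor(database, dims, goal)
            # Perturb the best known command and clip to motor bounds.
            m = tuple(min(1.0, max(-1.0, x + rng.gauss(0, sigma)))
                      for x in best)
        # Every observation is shared by all modules.
        database.append((m, environment(m)))
    return database
```

Calling `model_babbling(200, seed=1)` returns 200 (motor, effect) pairs. An active variant in the spirit of the abstract would replace the uniform `rng.choice` over modules with a choice biased toward modules that currently yield more learning progress; that measure is not sketched here.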
