Model-based reinforcement learning for humanoids: A study on forming rewards with the iCub platform

Advances in robotics and cognitive science are driving the development of cognitive robotics. Modern robotic platforms can learn and reason about complex tasks and pursue behavioural goals in complex environments. Nevertheless, many challenges remain. One major challenge is to equip these robots with cognitive systems that let them deal with less constrained situations than the tightly controlled scenarios typical of industrial robotics. In this work we apply the Reinforcement Learning (RL) paradigm to study the autonomous development of robot controllers without a priori supervised learning. We discuss a model-based RL architecture and the cognitive implications of applying RL to humanoid robots. To this end we present a developmental framework for RL in robotics, together with its implementation and testing on the iCub robotic platform in two novel experimental scenarios. In particular, we focus on iCub simulation experiments that compare internal, perception-based reward signals with external ones, in order to compare the robot's learning performance when it is guided by its own perception of its actions' outcomes with its performance when its actions are evaluated externally.
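The comparison between internal and external reward signals can be illustrated with a minimal sketch. It does not reproduce the architecture or the iCub experiments described above. It assumes a hypothetical one-dimensional reaching task and a tabular Dyna-Q style learner as a stand-in for model-based RL. All names (`step`, `external_reward`, `internal_reward`, `train`, `greedy_steps_to_goal`) and all parameter values are illustrative assumptions. The external reward is computed from the true outcome, while the internal reward is computed from a noisy perception of it.

```python
import random

# Hypothetical toy task (an assumption, not the iCub setup): positions 0..5, goal at 5.
N_STATES = 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # move left / move right


def step(state, action):
    """True dynamics: move one cell and clip to the track."""
    return min(max(state + action, 0), N_STATES - 1)


def external_reward(next_state, rng):
    """Reward from an outside evaluator that observes the true outcome."""
    return 1.0 if next_state == GOAL else 0.0


def internal_reward(next_state, rng, noise=0.1):
    """Reward from the agent's own perception, which is wrong with probability `noise`."""
    perceived = next_state
    if rng.random() < noise:
        perceived = min(max(next_state + rng.choice((-1, 1)), 0), N_STATES - 1)
    return 1.0 if perceived == GOAL else 0.0


def train(reward_fn, episodes=200, planning_steps=10,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Dyna-Q: learn from real steps, then replay the learned model."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # (state, action) -> (last observed reward, next state)

    def update(s, a, r, s2):
        best_next = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    def choose(s):
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        best = max(q[(s, a)] for a in ACTIONS)
        return rng.choice([a for a in ACTIONS if q[(s, a)] == best])  # random tie-break

    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap on episode length
            a = choose(s)
            s2 = step(s, a)
            r = reward_fn(s2, rng)
            update(s, a, r, s2)          # learn from the real transition
            model[(s, a)] = (r, s2)      # update the learned model
            for _ in range(planning_steps):  # planning: replay remembered transitions
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
            if s == GOAL:
                break
    return q


def greedy_steps_to_goal(q, limit=50):
    """Follow the greedy policy from state 0; returns `limit` if the goal is not reached."""
    s, steps = 0, 0
    while s != GOAL and steps < limit:
        a = max(ACTIONS, key=lambda b: q[(s, b)])
        s = step(s, a)
        steps += 1
    return steps


if __name__ == "__main__":
    for name, fn in (("external", external_reward), ("internal", internal_reward)):
        print(name, greedy_steps_to_goal(train(fn)))
```

Because the reward function is the only thing that differs between the two runs, any gap in the number of greedy steps to the goal is attributable to the reward source. This mirrors the comparison at a toy scale only; the noise model for perception is an assumption made for illustration.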
