Integrating learning by experience and demonstration in autonomous robots

We propose an algorithm that integrates learning by experience with learning by demonstration: it operates on the basis of both an objective scalar measure of performance and a demonstrated behaviour. Applying the method to two qualitatively different experimental scenarios involving simulated mobile robots demonstrates its efficacy. The analysis of the results shows that robots trained with this integrated algorithm develop solutions that are functionally better than those obtained with either a pure learning-by-demonstration or a pure learning-by-experience algorithm. This is because the algorithm drives the learning process toward solutions that are qualitatively similar to the demonstration, while leaving the learning agent free to depart from the demonstration when this turns out to be necessary to maximize performance.
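The trade-off described above can be illustrated with a toy sketch. This is not the paper's algorithm: the task, the `performance` function, the parameter-space `demo_distance` (a stand-in for behavioural similarity), the `weight` coefficient, and the simple hill-climber are all assumptions chosen only to show how a scalar performance measure and a demonstration might be combined in a single objective.

```python
import random

# Hypothetical task: performance peaks at TARGET (unknown to the learner).
TARGET = [1.0, 2.0]
# Hypothetical imperfect demonstration: close to the optimum, but not on it.
DEMO = [0.8, 2.5]


def performance(params):
    """Scalar performance measure (higher is better); a toy assumption."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def demo_distance(params, demo):
    """Squared distance to the demonstration (lower means more similar)."""
    return sum((p - d) ** 2 for p, d in zip(params, demo))


def combined_score(params, demo, weight):
    """Reward performance, penalise departure from the demonstration."""
    return performance(params) - weight * demo_distance(params, demo)


def hill_climb(demo, weight, steps=2000, sigma=0.1, seed=0):
    """Keep a Gaussian perturbation whenever it does not lower the score."""
    rng = random.Random(seed)
    best = [0.0 for _ in demo]
    best_score = combined_score(best, demo, weight)
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        score = combined_score(candidate, demo, weight)
        if score >= best_score:
            best, best_score = candidate, score
    return best


solution = hill_climb(DEMO, weight=0.5)
print("solution:", solution)
print("performance of solution:", performance(solution))
print("performance of demonstration:", performance(DEMO))
```

In this toy setting the learner ends up between the demonstration and the performance optimum: it scores better than the demonstration itself while staying closer to it than the unconstrained optimum does. Setting `weight` to 0 reduces the sketch to pure learning by experience, and a very large `weight` approximates pure imitation.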
