Interactive Teaching for Vision-Based Mobile Robots: A Sensory-Motor Approach

For the last decade, we have been developing a vision-based architecture for mobile robot navigation. Using our bio-inspired model of navigation, robots can perform sensory-motor tasks in real time in unknown indoor and outdoor environments. Here we address the problem of autonomous incremental learning of a sensory-motor task demonstrated by an operator guiding a robot. The proposed system allows semisupervised learning of the task and adapts the partitioning of the environment to the complexity of the desired behavior. A real dialogue based on actions emerges from the interactive teaching. This interaction leads the robot to autonomously build precise sensory-motor dynamics that approximate the behavior of the teacher. The usability of the system is demonstrated by experiments on real robots in both indoor and outdoor environments. Accuracy measures are also proposed to evaluate the learned behavior against the expected behavioral attractor. These measures, used first in a real experiment and then in a simulated one, show how a real interaction between the teacher and the robot influences the learning process.
