Confidence-Based Multi-Robot Learning from Demonstration

Learning from demonstration algorithms enable a robot to learn a new policy based on demonstrations provided by a teacher. In this article, we explore a novel research direction, multi-robot learning from demonstration, which extends demonstration-based learning methods to collaborative multi-robot domains. Specifically, we study the problem of enabling a single person to teach individual policies to multiple robots at the same time. We present flexMLfD, a task- and platform-independent multi-robot demonstration learning framework that supports both independent and collaborative multi-robot behaviors. Building upon this framework, we contribute three approaches to teaching collaborative multi-robot behaviors based on different information-sharing strategies, and evaluate these approaches by teaching two Sony QRIO humanoid robots to perform three collaborative ball-sorting tasks. We then present a scalability analysis of flexMLfD using up to seven Sony AIBO robots. We conclude the article by proposing a formalization for a broader multi-robot learning from demonstration research area.
