Imitation learning from multiple demonstrators using global vision

Imitation learning enables a learner to expand its own skill set with behaviours that it observes from others. Most imitation learning systems learn from a single class of demonstrators, and often from only a single demonstrator. Such approaches are limited: in the real world, people have varying levels of skill and different approaches to solving problems, so learning from only one demonstrator provides a very narrow perspective. In the context of robots, widely differing physiologies make learning from many types of demonstrators equally important. A wheeled robot may watch a humanoid perform a task, for example, yet be unable to approximate its movements exactly (e.g. stepping over small obstacles). This thesis describes an approach to learning a task by observing demonstrations performed by multiple heterogeneous robots using global (overhead) vision, incorporating demonstrators that differ in size, physiology (wheeled vs. legged), and skill level. The imitator evaluates demonstrators relative to each other, which allows it to weight its learning towards the more skilled demonstrators. I assume the imitator has no initial knowledge of the observable effects of its own actions, and begin by training a set of Hidden Markov Models (HMMs) to map observations to its own actions.
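
To make the HMM-based observation-to-action mapping concrete, the sketch below shows one way such a mapping could work: observations are assumed to have already been quantized into discrete symbols, one HMM is assumed per imitator action, and the forward algorithm scores a new observation sequence against each model so the highest-likelihood action can be selected. The names (DiscreteHMM, classify_action, drive, turn) and all parameter values are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

class DiscreteHMM:
    """Minimal discrete-observation HMM used only for likelihood scoring."""

    def __init__(self, start_prob, trans_prob, emit_prob):
        self.pi = np.asarray(start_prob, dtype=float)  # (N,) initial state distribution
        self.A = np.asarray(trans_prob, dtype=float)   # (N, N) transition matrix, rows sum to 1
        self.B = np.asarray(emit_prob, dtype=float)    # (N, M) emissions over M observation symbols

    def log_likelihood(self, obs_seq):
        """Scaled forward algorithm: returns log P(obs_seq | model)."""
        alpha = self.pi * self.B[:, obs_seq[0]]
        log_prob = 0.0
        for t, symbol in enumerate(obs_seq):
            if t > 0:
                alpha = (alpha @ self.A) * self.B[:, symbol]
            scale = alpha.sum()
            if scale == 0.0:               # sequence impossible under this model
                return -np.inf
            log_prob += np.log(scale)
            alpha = alpha / scale          # normalise to avoid numerical underflow
        return log_prob


def classify_action(obs_seq, action_models):
    """Return the action whose HMM best explains the quantized observation sequence."""
    scores = {name: hmm.log_likelihood(obs_seq) for name, hmm in action_models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical usage: two 2-state models over a 3-symbol observation alphabet.
drive = DiscreteHMM([0.8, 0.2],
                    [[0.9, 0.1], [0.2, 0.8]],
                    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
turn = DiscreteHMM([0.5, 0.5],
                   [[0.6, 0.4], [0.4, 0.6]],
                   [[0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
best_action, scores = classify_action([0, 0, 1, 0, 2], {"drive": drive, "turn": turn})
```

In practice the per-action models would be trained (e.g. with Baum-Welch) from sequences observed while the imitator executes each action, rather than specified by hand as above.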
