Paper information - A formalism for learning from demonstration

The paper describes and formalizes the concepts and assumptions involved in Learning from Demonstration (LFD), a common learning technique used in robotics. LFD-related concepts like goal, generalization, and repetition are here defined, analyzed, and put into context. Robot behaviors are described in terms of trajectories through information spaces and learning is formulated as mappings between some of these spaces. Finally, behavior primitives are introduced as one example of good bias in learning, dividing the learning process into the three stages of behavior segmentation, behavior recognition, and behavior coordination. The formalism is exemplified through a sequence learning task where a robot equipped with a gripper arm is to move objects to specific areas. The introduced concepts are illustrated with special focus on how bias of various kinds can be used to enable learning from a single demonstration, and how ambiguities in demonstrations can be identified and handled.

Thomas Hellström | Erik Alexander Billing
