Paper information - Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models


Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while remaining invariant to the appearance of the objects, to geometric aspects of the objects such as their position, size, and orientation, and to the viewpoint of the observer in the demonstrations. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments (also termed sub-goals or options), and smoothly follows the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations with respect to different coordinate systems describing virtual landmarks or objects of interest with a task-parameterized formulation, and adapts the segments according to the environmental changes in a systematic manner. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small variance asymptotics, yielding considerably lower sample and model complexity than deep learning approaches. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle based on only 4 kinesthetic demonstrations.
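The task-parameterized step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the common formulation in which each coordinate system (frame) j is described by a linear map A_j and an offset b_j, one Gaussian per frame is learned for a given segment, and the segment is adapted to a new situation by mapping each Gaussian into the global frame and taking their product. The function name `product_of_frame_gaussians` and the toy numbers are hypothetical.

```python
import numpy as np


def product_of_frame_gaussians(local_means, local_covs, frames):
    """Combine one Gaussian per frame into a single global-frame Gaussian.

    local_means: list of (d,) mean vectors, one per frame (local coordinates)
    local_covs:  list of (d, d) covariance matrices, one per frame
    frames:      list of (A, b) pairs; A is (d, d), b is (d,)  [assumed layout]
    """
    d = np.asarray(local_means[0]).shape[0]
    precision_sum = np.zeros((d, d))
    weighted_mean_sum = np.zeros(d)
    for mu, sigma, (A, b) in zip(local_means, local_covs, frames):
        # Map the local Gaussian into the global frame.
        mu_global = A @ np.asarray(mu, dtype=float) + b
        sigma_global = A @ np.asarray(sigma, dtype=float) @ A.T
        # A product of Gaussians adds precisions and precision-weighted means.
        precision = np.linalg.inv(sigma_global)
        precision_sum += precision
        weighted_mean_sum += precision @ mu_global
    cov = np.linalg.inv(precision_sum)
    mean = cov @ weighted_mean_sum
    return mean, cov


# Toy example: two frames with identity orientation and equal confidence.
local_means = [np.zeros(2), np.zeros(2)]
local_covs = [np.eye(2), np.eye(2)]

frames_a = [(np.eye(2), np.array([0.0, 0.0])), (np.eye(2), np.array([1.0, 0.0]))]
frames_b = [(np.eye(2), np.array([0.0, 0.0])), (np.eye(2), np.array([3.0, 0.0]))]

mean_a, cov_a = product_of_frame_gaussians(local_means, local_covs, frames_a)
mean_b, cov_b = product_of_frame_gaussians(local_means, local_covs, frames_b)
print(mean_a, mean_b)
```

With equal covariances the combined mean falls midway between the two frame origins, so moving the second frame moves the adapted segment with it, without retraining. A frame whose covariance is tighter along some direction would pull the result toward itself along that direction.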
