Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models

Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while remaining invariant to the appearance of the objects, to their geometric properties such as position, size, and orientation, and to the viewpoint of the observer. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations using invariant formulations of hidden semi-Markov models to extract invariant segments (also called sub-goals or options), and smoothly follows the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations observed with respect to different coordinate systems describing virtual landmarks or objects of interest, and systematically adapts the segments to changes in the environment. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small-variance asymptotics, yielding low sample and model complexity for acquiring new manipulation skills. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle from only four kinesthetic demonstrations.
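To make the idea of adapting segments across coordinate systems more concrete, the following is a minimal sketch of one common formulation used in task-parameterized models: each segment is stored as one Gaussian per coordinate system, each local Gaussian is mapped into the global frame with that frame's current pose, and the results are fused with a product of Gaussians. This is an illustration under stated assumptions, not necessarily the exact procedure of the paper; the function names (`transform_gaussian`, `product_of_gaussians`, `adapt_segment`) and the frame representation as a pair `(A, b)` are hypothetical.

```python
import numpy as np

def transform_gaussian(mu, sigma, A, b):
    """Map a Gaussian from a local frame to the global frame: x -> A x + b."""
    return A @ mu + b, A @ sigma @ A.T

def product_of_gaussians(mus, sigmas):
    """Fuse several Gaussians over the same variable into one Gaussian.

    The fused precision is the sum of the individual precisions, and the
    fused mean is the precision-weighted average of the individual means.
    """
    precisions = [np.linalg.inv(s) for s in sigmas]
    sigma = np.linalg.inv(sum(precisions))
    mu = sigma @ sum(p @ m for p, m in zip(precisions, mus))
    return mu, sigma

def adapt_segment(local_mus, local_sigmas, frames):
    """Adapt one segment to new frame poses (assumed given as (A, b) pairs)."""
    global_mus, global_sigmas = [], []
    for mu, sigma, (A, b) in zip(local_mus, local_sigmas, frames):
        g_mu, g_sigma = transform_gaussian(mu, sigma, A, b)
        global_mus.append(g_mu)
        global_sigmas.append(g_sigma)
    return product_of_gaussians(global_mus, global_sigmas)

# Usage: one segment seen from two frames (e.g. an object and an obstacle).
local_mus = [np.zeros(2), np.zeros(2)]
local_sigmas = [np.eye(2), np.eye(2)]
frames = [(np.eye(2), np.array([0.0, 0.0])),
          (np.eye(2), np.array([2.0, 4.0]))]
mu, sigma = adapt_segment(local_mus, local_sigmas, frames)
```

With equal covariances in both frames, the fused mean lies midway between the two frame origins. In practice, a frame in which the demonstrations varied little has a tight covariance and therefore dominates the product, which is what lets the segment follow the relevant landmark when it moves.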
