Learning Multiple Behaviors from Unlabeled Demonstrations in a Latent Controller Space

In this paper we introduce a method to learn multiple behaviors, in the form of motor primitives, from an unlabeled dataset. One difficulty of this problem is that behaviors can be heavily mixed in the measurement space, even though a latent representation exists in which they are easily separated. We propose a mixture model based on a Dirichlet Process (DP) that simultaneously clusters the observed time series and recovers a sparse representation of the behaviors, using a Laplacian prior as the base measure of the DP. We show that for linear models, e.g., potential functions generated by linear combinations of a large number of features, the marginal of the observations can be computed analytically, which yields an efficient sampler. The method is evaluated on robot behaviors and on real human motion data, and compared to other techniques.
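To make the role of the analytic marginal concrete, the following is a minimal sketch of one collapsed Gibbs sweep for a DP mixture of linear models. It is not the sampler of the paper: for brevity it assumes a Gaussian prior on the weights in place of the Laplacian base measure, a fixed noise variance, and scalar observations per time step. The names `log_marginal`, `gibbs_sweep`, `alpha`, `sigma2`, and `concentration` are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_marginal(Phi, y, alpha=1.0, sigma2=0.1):
    """log p(y | Phi) for y = Phi w + noise, with the weights integrated out.

    Assumption: w ~ N(0, I / alpha) and noise ~ N(0, sigma2 * I), so that
    y ~ N(0, sigma2 * I + Phi Phi^T / alpha). A Gaussian prior stands in
    here for the Laplacian prior described in the abstract.
    """
    n = len(y)
    cov = sigma2 * np.eye(n) + Phi @ Phi.T / alpha
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

def gibbs_sweep(Phis, ys, z, concentration, rng):
    """One collapsed Gibbs sweep over the cluster labels z (a list of ints).

    Phis[i] is the feature matrix of trajectory i and ys[i] its observations.
    Each trajectory is reassigned using Chinese-restaurant-process weights
    multiplied by the marginal predictive likelihood of its data.
    """
    for i in range(len(ys)):
        z[i] = -1  # remove trajectory i from its current cluster
        labels = sorted(set(z) - {-1})
        logp = []
        for k in labels:
            members = [j for j in range(len(ys)) if z[j] == k]
            Phi_k = np.vstack([Phis[j] for j in members])
            y_k = np.concatenate([ys[j] for j in members])
            # Predictive of trajectory i given cluster k:
            # marginal with i included minus marginal without i.
            joint = log_marginal(np.vstack([Phi_k, Phis[i]]),
                                 np.concatenate([y_k, ys[i]]))
            logp.append(np.log(len(members)) + joint - log_marginal(Phi_k, y_k))
        # Option of opening a new cluster, weighted by the DP concentration.
        logp.append(np.log(concentration) + log_marginal(Phis[i], ys[i]))
        logp = np.array(logp)
        p = np.exp(logp - logp.max())
        p /= p.sum()
        choice = rng.choice(len(p), p=p)
        if choice < len(labels):
            z[i] = labels[choice]
        else:
            z[i] = max(labels, default=-1) + 1
    return z
```

Because the weights are integrated out, the sampler only moves cluster labels, which is the property the abstract attributes to linear models. Handling the Laplacian prior would require an extra step, for example a scale-mixture-of-Gaussians representation, which this sketch omits.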
