Boosting Robot Learning and Control with Domain Constraints

In this short paper, we exploit robotics domain knowledge, whether acquired from humans or through self-organization, to alleviate the learning and control challenges that arise from dealing directly with raw demonstrations or sparse reward signals. We take a unified latent-variable perspective on incorporating domain constraints: the latent variables are regarded as task parameters or representations that rationalize task observations through a generative model, so the constraints can be specified over structured latent variables. Unlike many related works, we explore latent structures that are computationally feasible and robotics-oriented, facilitating both task learning and control synthesis. The paper briefly discusses the adopted structures, ranging from parameter dependency to modality and dynamical associativity, which extend imitation-learning methods such as inverse optimal control and deep generative models. The framework is shown to be effective in a range of manipulation tasks, including 1) learning variable impedance controllers for robotic handwriting; 2) boosting motion synthesis for writing novel symbols; and 3) reasoning with an internal model to score a ball target under malfunctioning visual input.
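To make the latent-variable perspective concrete, the following is a minimal sketch (hypothetical, not the paper's actual model): a linear-Gaussian generative model in which a low-dimensional latent z, playing the role of a task parameter, rationalizes observations x = Wz + mu + noise. The model is fit with a few EM iterations (factor-analysis style, with the noise variance held fixed for simplicity); the function names and dimensions are illustrative assumptions.

```python
import numpy as np

def fit_latent_model(X, latent_dim=2, n_iters=50, noise=1e-2, seed=0):
    """Fit x = W z + mu + eps by EM (sketch; noise variance kept fixed)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    W = rng.normal(scale=0.1, size=(d, latent_dim))  # random loading init
    for _ in range(n_iters):
        # E-step: Gaussian posterior over latents,
        # q(z|x) = N(M^{-1} W^T (x - mu), noise * M^{-1}) with M = W^T W + noise*I
        M = W.T @ W + noise * np.eye(latent_dim)
        Minv = np.linalg.inv(M)
        Ez = Xc @ W @ Minv                      # (n, latent_dim) posterior means
        Ezz = n * noise * Minv + Ez.T @ Ez      # sum over n of E[z z^T]
        # M-step: update the loading matrix W
        W = Xc.T @ Ez @ np.linalg.inv(Ezz)
    return W, mu

def reconstruct(X, W, mu, noise=1e-2):
    """Map observations through the latent space and back (posterior-mean decode)."""
    latent_dim = W.shape[1]
    M = W.T @ W + noise * np.eye(latent_dim)
    Ez = (X - mu) @ W @ np.linalg.inv(M)
    return Ez @ W.T + mu
```

In this toy setting, constraining the latent dimensionality is the "structured latent variable": observations are explained only through the low-dimensional task parameter z, which is the kind of structure the paper exploits (with richer, robotics-oriented forms) for learning and control.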
