Auto-conditioned Recurrent Mixture Density Networks for Complex Trajectory Generation

Recent advances in machine learning have produced recurrent neural networks that can synthesize high-dimensional motion sequences over long time horizons. Leveraging these sequence-learning techniques, we introduce a state transition model (STM) that learns a variety of complex motion sequences in joint-position space. Given a few demonstrations from a motion planner, we show in real-robot experiments that the learned STM quickly generalizes to unseen tasks. Our approach enables the robot to accomplish complex behaviors from high-level instructions that would otherwise require laborious hand-engineered sequencing of trajectories with traditional motion planners. A video of our experiments is available at this https URL.
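To make the core mechanism concrete, here is a minimal NumPy sketch of a recurrent mixture-density state transition model of the kind the abstract describes: a mixture-of-Gaussians output head over joint positions, rolled out auto-regressively by feeding the model's own samples back as input (auto-conditioning). All names (`mdn_params`, `rollout`, the weight matrices `W_in`, `W_pi`, `W_mu`, `W_sigma`) are hypothetical stand-ins for a trained network, and a single `tanh` layer stands in for the recurrent cell; this is an illustrative sketch, not the paper's implementation.

```python
import numpy as np

def mdn_params(h, W_pi, W_mu, W_sigma, K, D):
    """Map a hidden state h to mixture-of-Gaussians parameters.

    Hypothetical trained output weights W_pi, W_mu, W_sigma produce
    K mixture components over a D-dimensional joint configuration.
    """
    logits = W_pi @ h
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()                              # mixture weights sum to 1
    mu = (W_mu @ h).reshape(K, D)               # component means
    sigma = np.exp(W_sigma @ h).reshape(K, D)   # positive std devs via exp
    return pi, mu, sigma

def sample_next_state(pi, mu, sigma, rng):
    """Draw the next joint configuration from the predicted mixture."""
    k = rng.choice(len(pi), p=pi)               # pick a component
    return rng.normal(mu[k], sigma[k])          # sample within it

def rollout(x0, W_in, W_pi, W_mu, W_sigma, K, D, steps, rng):
    """Auto-conditioned rollout: each sampled state is fed back as input."""
    x, traj = x0, []
    for _ in range(steps):
        h = np.tanh(W_in @ x)                   # stand-in for an LSTM cell
        pi, mu, sigma = mdn_params(h, W_pi, W_mu, W_sigma, K, D)
        x = sample_next_state(pi, mu, sigma, rng)
        traj.append(x)
    return np.array(traj)                       # (steps, D) joint trajectory
```

Sampling from a mixture rather than regressing a single mean lets the model represent multimodal demonstrations (e.g., two distinct ways to reach a goal) without averaging them into an invalid trajectory.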
