Paper Information - Motion Planning With Success Judgement Model Based on Learning From Demonstration

Motion Planning With Success Judgement Model Based on Learning From Demonstration

Learning from Demonstration is a technique that allows robots to learn actions in human living environments directly from demonstrations. When actions are learned directly from demonstrations, however, the taught actions cannot be reused across situations with different restrictions. In this study, we propose a method that trains a success judgment model based on Learning from Demonstration and uses it as a differentiable loss function for tasks. By formulating the constraints on the action as a mathematical optimization problem and combining these constraints with the learned success judgment model into a single loss function, an action generation model can be trained by the gradient method. The system was verified on the action of scooping up a pancake.
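The idea of using a learned success judgment model as a differentiable loss can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the action is a 2-D parameter vector, the success judgment model is a logistic regression fitted to synthetic labeled demonstrations, the constraint is a per-dimension upper bound expressed as a quadratic penalty, and a single action is optimized directly instead of training an action generation network. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Plain logistic function; inputs in this toy example stay in a safe range.
    return 1.0 / (1.0 + math.exp(-z))

def train_success_model(actions, labels, lr=0.5, steps=500):
    """Fit a logistic-regression model p(success | action) by gradient descent."""
    dim, n = len(actions[0]), len(actions)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * dim, 0.0
        for a, y in zip(actions, labels):
            err = sigmoid(sum(wi * ai for wi, ai in zip(w, a)) + b) - y
            gw = [g + err * ai for g, ai in zip(gw, a)]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def loss_and_grad(a, w, b, upper, penalty=10.0):
    """Loss = -log p(success) + penalty * sum(max(0, a_i - upper_i)^2)."""
    p = sigmoid(sum(wi * ai for wi, ai in zip(w, a)) + b)
    viol = [max(0.0, ai - ui) for ai, ui in zip(a, upper)]
    loss = -math.log(p + 1e-12) + penalty * sum(v * v for v in viol)
    # d(-log p)/da = -(1 - p) * w; the penalty term contributes 2 * penalty * viol.
    grad = [-(1.0 - p) * wi + 2.0 * penalty * v for wi, v in zip(w, viol)]
    return loss, grad

def plan_action(w, b, upper, start, lr=0.05, steps=500):
    """Gradient descent on the combined loss, starting from a failing action."""
    a = list(start)
    for _ in range(steps):
        _, g = loss_and_grad(a, w, b, upper)
        a = [ai - lr * gi for ai, gi in zip(a, g)]
    return a

random.seed(0)
# Synthetic "demonstrations": the success rule below is a hypothetical stand-in.
demos = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
labels = [1.0 if a[0] + a[1] > 0.5 else 0.0 for a in demos]

w, b = train_success_model(demos, labels)
upper = [0.6, 0.6]  # assumed constraint: each action component should stay below 0.6
action = plan_action(w, b, upper, start=[-0.5, -0.5])
p_success = sigmoid(sum(wi * ai for wi, ai in zip(w, action)) + b)
print(action, p_success)
```

Because the constraint enters as a soft penalty, the optimized action may exceed the bound by a small margin; a larger `penalty` tightens this at the cost of a stiffer optimization problem. Changing `upper` changes the restriction without retraining the success model, which is the kind of reuse the abstract describes.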

Toshiaki Tsuji | Sho Sakaino | Daichi Furuta | Kyo Kutsuzawa

[1] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.

[2] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.

[3] Shigeki Sugano, et al. Repeatable Folding Task by Humanoid Robot Worker Using Deep Learning, 2017, IEEE Robotics and Automation Letters.

[4] Toshiaki Tsuji, et al. Trajectory adjustment for nonprehensile manipulation using latent space of trained sequence-to-sequence model, 2019, Adv. Robotics.

[5] Tsuyoshi Adachi, et al. Imitation Learning for Object Manipulation Based on Position/Force Information Using Bilateral Control, 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[6] John J. Craig, et al. Hybrid position/force control of manipulators, 1981.

[7] 山田祐, et al. Open Dynamics Engine を用いたスノーボードロボットシミュレータの開発 [Development of a snowboard robot simulator using Open Dynamics Engine], 2007.

[8] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.

[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[10] Franziska Meier, et al. SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Control, 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[11] F. Gers, et al. Long short-term memory in recurrent neural networks, 2001.

[12] Toshiaki Tsuji, et al. Sequence-to-sequence models for trajectory deformation of dynamic manipulation, 2017, IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society.

[13] Quang-Cuong Pham, et al. Dynamic non-prehensile object transportation, 2014, 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV).

[14] Marco Pavone, et al. Robot Motion Planning in Learned Latent Spaces, 2018, IEEE Robotics and Automation Letters.

[15] Rouhollah Rahmatizadeh, et al. Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-to-End Learning from Demonstration, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[16] Jake K. Aggarwal, et al. View invariant human action recognition using histograms of 3D joints, 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[17] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.

[18] Georgios Evangelidis, et al. Skeletal Quads: Human Action Recognition Using Joint Quadruples, 2014, 2014 22nd International Conference on Pattern Recognition.

[19] Toshiaki Tsuji, et al. Optimized Trajectory Generation based on Model Predictive Control for Turning Over Pancakes, 2018.

[20] Noriaki Hirose, et al. Personal robot assisting transportation to support active human life: Posture stabilization based on feedback compensation of lateral acceleration, 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[21] Solvi Arnold, et al. EMD Net: An Encode–Manipulate–Decode Network for Cloth Manipulation, 2018, IEEE Robotics and Automation Letters.

[22] Ross A. Knepper, et al. DeepMPC: Learning Deep Latent Features for Model Predictive Control, 2015, Robotics: Science and Systems.

[23] Wenjun Zeng, et al. Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks, 2016, ECCV.

[24] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[25] Toshiaki Tsuji, et al. A Cooking Support System with Force Visualization Using Force Sensors and an RGB-D Camera, 2016, AsiaHaptics.

[26] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.

[27] Marc'Aurelio Ranzato, et al. Sequence Level Training with Recurrent Neural Networks, 2015, ICLR.

[28] Silvio Savarese, et al. Deep Visual MPC-Policy Learning for Navigation, 2019, IEEE Robotics and Automation Letters.

[29] Carme Torras, et al. A robot learning from demonstration framework to perform force-based manipulation tasks, 2013, Intelligent Service Robotics.

[30] Darwin G. Caldwell, et al. Upper-body kinesthetic teaching of a free-standing humanoid robot, 2011, 2011 IEEE International Conference on Robotics and Automation.

[31] Manfred Morari, et al. Model predictive control: Theory and practice - A survey, 1989, Autom.

[32] Ken Goldberg, et al. Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation, 2017, ICRA.

[33] Toru Ogawa, et al. Dynamic Manipulation of Flexible Objects with Torque Sequence Using a Deep Neural Network, 2019, 2019 International Conference on Robotics and Automation (ICRA).

[34] Gang Wang, et al. Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition, 2016, ECCV.

[35] Xiaoou Tang, et al. Action Recognition and Detection by Combining Motion and Appearance Features, 2014.

[36] Steven M. LaValle, et al. Randomized Kinodynamic Planning, 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).

[37] Huosheng Hu, et al. Robot Learning from Demonstration in Robotic Assembly: A Survey, 2018, Robotics.

[38] Oussama Khatib, et al. A unified approach for motion and force control of robot manipulators: The operational space formulation, 1987, IEEE J. Robotics Autom.

[39] Kensuke Harada, et al. Deep Learning Scooping Motion Using Bilateral Teleoperations, 2018, 2018 3rd International Conference on Advanced Robotics and Mechatronics (ICARM).