Paper Information - Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation

The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal. We present the Cascaded Variational Inference (CAVIN) Planner, a model-based method that hierarchically generates plans by sampling from latent spaces. To facilitate planning over long time horizons, our method learns latent representations that decouple the prediction of high-level effects from the generation of low-level motions through cascaded variational inference. This enables us to model dynamics at two different levels of temporal resolution for hierarchical planning. We evaluate our approach on three multi-step robotic manipulation tasks in cluttered tabletop environments given high-dimensional observations. Empirical results demonstrate that the proposed method outperforms state-of-the-art model-based methods by strategically interacting with multiple objects.
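The two-level sampling idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned models are replaced by hand-written closed-form functions on a one-dimensional state, and every name and constant (`meta_dynamics`, `dynamics`, `generate_actions`, `EFFECT_SCALE`, `STEPS_PER_EFFECT`) is a hypothetical stand-in chosen for illustration. It only shows the general pattern of first sampling high-level effect codes to pick subgoals, then sampling low-level motion codes to pick the actions that reach them.

```python
import random

EFFECT_SCALE = 0.5     # hypothetical: displacement produced by a unit effect code
STEPS_PER_EFFECT = 4   # hypothetical: low-level steps per high-level step


def meta_dynamics(state, effect):
    """Toy stand-in for a learned high-level model: state after one effect."""
    return state + EFFECT_SCALE * effect


def dynamics(state, action):
    """Toy stand-in for a learned low-level model: state after one action."""
    return state + action


def generate_actions(effect, motion):
    """Toy stand-in for an action generator conditioned on both latent codes."""
    base = EFFECT_SCALE * effect / STEPS_PER_EFFECT
    return [base + 0.02 * motion for _ in range(STEPS_PER_EFFECT)]


def plan(state, goal, horizon=3, n_effects=256, n_motions=64, rng=None):
    rng = rng or random.Random(0)

    # High level: sample effect-code sequences and keep the one whose
    # predicted final subgoal lands closest to the goal.
    best_effects, best_subgoals, best_cost = None, None, float("inf")
    for _ in range(n_effects):
        effects = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
        subgoals, s = [], state
        for c in effects:
            s = meta_dynamics(s, c)
            subgoals.append(s)
        cost = abs(s - goal)
        if cost < best_cost:
            best_effects, best_subgoals, best_cost = effects, subgoals, cost

    # Low level: for each chosen effect, sample motion codes and keep the
    # action segment whose predicted end state is closest to its subgoal.
    actions, s = [], state
    for c, subgoal in zip(best_effects, best_subgoals):
        best_segment, best_end, best_err = None, None, float("inf")
        for _ in range(n_motions):
            z = rng.gauss(0.0, 1.0)
            segment = generate_actions(c, z)
            end = s
            for a in segment:
                end = dynamics(end, a)
            err = abs(end - subgoal)
            if err < best_err:
                best_segment, best_end, best_err = segment, end, err
        actions.extend(best_segment)
        s = best_end
    return actions, s
```

Calling `plan(0.0, 1.0)` returns a flat list of `horizon * STEPS_PER_EFFECT` low-level actions together with the predicted final state. In the setting the abstract describes, the toy functions would be replaced by learned models operating on high-dimensional observations, and the plan would be evaluated against a task-specific objective rather than a scalar distance.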

[1] W. Köhler. The Mentality of Apes, 2018, Nature.

[2] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.

[3] Marko Bacic, et al. Model predictive control, 2003.

[4] Dirk P. Kroese, et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning, 2004.

[5] Lih-Yuan Deng, et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, 2006, Technometrics.

[6] A. Kacelnik, et al. Cognitive Processes Associated with Sequential Tool Use in New Caledonian Crows, 2009, PloS one.

[7] Leslie Pack Kaelbling, et al. Hierarchical task and motion planning in the now, 2011, 2011 IEEE International Conference on Robotics and Automation.

[8] David Q. Mayne, et al. Model predictive control: Recent developments and future promise, 2014, Autom.

[9] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[10] Pieter Abbeel, et al. Combined task and motion planning through an extensible planner-independent interface layer, 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[11] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.

[12] Honglak Lee, et al. Learning Structured Output Representation using Deep Conditional Generative Models, 2015, NIPS.

[13] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[14] Siddhartha S. Srinivasa, et al. The YCB object and Model set: Towards common benchmarks for manipulation research, 2015, 2015 International Conference on Advanced Robotics (ICAR).

[15] Dan Klein, et al. Neural Module Networks, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Jitendra Malik, et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics, 2016, NIPS.

[17] Razvan Pascanu, et al. A simple neural network module for relational reasoning, 2017, NIPS.

[18] Dan Klein, et al. Modular Multitask Reinforcement Learning with Policy Sketches, 2016, ICML.

[19] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.

[20] Jitendra Malik, et al. Combining self-supervised learning and imitation for vision-based rope manipulation, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[21] Sergey Levine, et al. Deep visual foresight for planning robot motion, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[22] Gaurav S. Sukhatme, et al. Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets, 2017, NIPS.

[23] Leonidas J. Guibas, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Aviv Tamar, et al. Imitation Learning from Visual Data with Multiple Intentions, 2018, ICLR.

[25] Sergey Levine, et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings, 2018, ICML.

[26] Alberto Rodriguez, et al. Learning Synergies Between Pushing and Grasping with Self-Supervised Deep Reinforcement Learning, 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[27] Pieter Abbeel, et al. Learning Plannable Representations with Causal InfoGAN, 2018, NeurIPS.

[28] Silvio Savarese, et al. Learning task-oriented grasping for tool manipulation from simulated self-supervision, 2018, Robotics: Science and Systems.

[29] Marco Pavone, et al. Learning Sampling Distributions for Robot Motion Planning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[30] Philip S. Thomas, et al. Learning Action Representations for Reinforcement Learning, 2019, ICML.

[31] Silvio Savarese, et al. Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter, 2019, 2019 International Conference on Robotics and Automation (ICRA).

[32] Marco Pavone, et al. Robot Motion Planning in Learned Latent Spaces, 2018, IEEE Robotics and Automation Letters.
