Latent Space Roadmap for Visual Action Planning of Deformable and Rigid Object Manipulation

We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, such as the manipulation of deformable objects. Planning is performed in a low-dimensional latent state space that embeds images. We define and implement a Latent Space Roadmap (LSR), a graph-based structure that globally captures the latent system dynamics. Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between consecutive images in the plan. We demonstrate the effectiveness of the method on a simulated box-stacking task and on a T-shirt folding task performed with a real robot.
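
The planning loop described above can be summarized as: encode the start and goal images into the latent space, snap them to their nearest roadmap nodes, search the LSR for a shortest path, decode the path into a visual plan, and query the APN for the action linking each consecutive pair of latent states. Below is a minimal Python sketch of this loop using NetworkX. The functions encode, decode, and propose_action are hypothetical stand-ins for the trained VFM encoder/decoder and the APN, and the node-merging rule (a fixed radius in latent space) is a simplifying assumption for illustration, not the authors' exact construction.

    import numpy as np
    import networkx as nx

    LATENT_DIM = 8

    def encode(image):
        # Stand-in for the VFM encoder: maps an image to a latent state.
        return np.asarray(image, dtype=float).reshape(-1)[:LATENT_DIM]

    def decode(z):
        # Stand-in for the VFM decoder: maps a latent state back to an image.
        return z

    def propose_action(z_from, z_to):
        # Stand-in for the APN: predicts the action linking two latent states.
        return {"delta": z_to - z_from}

    def build_lsr(latent_pairs, merge_radius):
        """Build the Latent Space Roadmap: merge nearby latent states into
        nodes and add an edge for every observed action-induced transition."""
        graph = nx.Graph()
        centroids = []

        def node_for(z):
            for i, c in enumerate(centroids):
                if np.linalg.norm(z - c) < merge_radius:
                    return i  # reuse an existing node for nearby states
            centroids.append(z)
            graph.add_node(len(centroids) - 1, z=z)
            return len(centroids) - 1

        for z_start, z_end in latent_pairs:
            graph.add_edge(node_for(z_start), node_for(z_end))
        return graph

    def plan(graph, start_image, goal_image):
        """Produce a visual plan (decoded images) and the actions between
        consecutive states via shortest-path search on the roadmap."""
        def nearest(z):
            return min(graph.nodes,
                       key=lambda n: np.linalg.norm(graph.nodes[n]["z"] - z))

        path = nx.shortest_path(graph,
                                nearest(encode(start_image)),
                                nearest(encode(goal_image)))
        visual_plan = [decode(graph.nodes[n]["z"]) for n in path]
        actions = [propose_action(graph.nodes[a]["z"], graph.nodes[b]["z"])
                   for a, b in zip(path, path[1:])]
        return visual_plan, actions

    # Toy usage on a chain of synthetic latent transitions.
    rng = np.random.default_rng(0)
    states = [rng.normal(size=LATENT_DIM) for _ in range(6)]
    lsr = build_lsr(list(zip(states, states[1:])), merge_radius=1.0)
    images, actions = plan(lsr, decode(states[0]), decode(states[-1]))

In the real system the roadmap edges come from observed action-labeled image pairs, so shortest-path search returns only transitions the system has evidence it can execute.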
