Improvisation through Physical Understanding: Using Novel Objects as Tools with Visual Foresight

Machine learning techniques have enabled robots to learn narrow yet complex tasks, and to perform broad yet simple skills with a wide variety of objects. However, learning a model that can both perform complex tasks and generalize to previously unseen objects and goals remains a significant challenge. We study this challenge in the context of "improvisational" tool use: a robot is presented with novel objects and a user-specified goal (e.g., sweep some clutter into the dustpan), and must figure out, using only raw image observations, how to accomplish the goal using the available objects as tools. We approach this problem by training a model with both a visual and a physical understanding of multi-object interactions, and by developing a sampling-based optimizer that can leverage these interactions to accomplish tasks. To do so, we combine diverse demonstration data with self-supervised interaction data: the interaction data is used to build generalizable models, and the demonstration data is used to guide the model-based RL planner toward solving complex tasks. Our experiments show that our approach can solve a variety of complex tool use tasks from raw pixel inputs, outperforming both imitation learning alone and self-supervised learning alone. Furthermore, we show that the robot can perceive and use novel objects as tools, including objects that are not conventional tools, while also dynamically choosing whether to use a tool depending on whether one is required.
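The abstract does not specify the sampling-based optimizer, so the following is only a minimal sketch of the general idea, assuming a cross-entropy-style sampler. The function names (`toy_model`, `rollout_cost`, `cem_plan`), the 2-D point dynamics, and the distance-to-goal cost are all hypothetical stand-ins chosen for illustration; they are not the learned visual model or the planner used in the paper. The optional `init_mean` argument illustrates one way a demonstration-derived action proposal might seed the search, which is likewise an assumption.

```python
import math
import random


def toy_model(state, action):
    """Hypothetical stand-in for a learned predictive model:
    a 2-D point that moves by the commanded displacement."""
    return (state[0] + action[0], state[1] + action[1])


def rollout_cost(state, actions, goal):
    """Roll an action sequence through the model and return the
    distance between the predicted final state and the goal."""
    for action in actions:
        state = toy_model(state, action)
    return math.hypot(state[0] - goal[0], state[1] - goal[1])


def cem_plan(state, goal, horizon=5, samples=200, elites=20, iters=10,
             init_mean=None, seed=0):
    """Cross-entropy-style planner: sample action sequences, keep the
    lowest-cost ones, refit a diagonal Gaussian to them, and repeat."""
    rng = random.Random(seed)
    dim = horizon * 2  # flat vector: horizon steps x 2 action dimensions
    # Assumption: a demonstration-derived proposal could be passed as init_mean.
    mean = list(init_mean) if init_mean is not None else [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        candidates = []
        for _ in range(samples):
            flat = [rng.gauss(m, s) for m, s in zip(mean, std)]
            actions = [(flat[2 * t], flat[2 * t + 1]) for t in range(horizon)]
            candidates.append((rollout_cost(state, actions, goal), flat))
        candidates.sort(key=lambda c: c[0])
        best = [flat for _, flat in candidates[:elites]]
        # Refit the sampling distribution to the elite set.
        columns = list(zip(*best))
        mean = [sum(col) / elites for col in columns]
        std = [max(1e-3, math.sqrt(sum((x - m) ** 2 for x in col) / elites))
               for col, m in zip(columns, mean)]
    # Return the mean action sequence as the plan.
    return [(mean[2 * t], mean[2 * t + 1]) for t in range(horizon)]
```

Calling `cem_plan((0.0, 0.0), (3.0, -2.0))` returns a list of five 2-D actions whose summed displacement should land near the goal. In an image-based setting the toy model would be replaced by a learned predictor and the cost by a measure defined on predicted observations; those pieces are outside the scope of this sketch.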
