Deep Affordance Foresight: Planning Through What Can Be Done in the Future

Planning in realistic environments requires searching in large planning spaces. Affordances are a powerful concept to simplify this search, because they model what actions can be successful in a given situation. However, the classical notion of affordance is not suitable for long horizon planning because it only informs the robot about the immediate outcome of actions instead of what actions are best for achieving a long-term goal. In this paper, we introduce a new affordance representation that enables the robot to reason about the longterm effects of actions through modeling what actions are afforded in the future. Based on the new representation, we develop a learning-to-plan method, Deep Affordance Foresight (DAF), that learns partial environment models of affordances of parameterized motor skills through trial-and-error. We evaluate DAF on two challenging manipulation domains and show that it can effectively learn to carry out multi-step tasks, share learned affordance representations among different tasks, and learn to plan with high-dimensional image inputs.

[1]  Xinyu Liu,et al.  Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics , 2017, Robotics: Science and Systems.

[2]  Lydia E. Kavraki,et al.  Learning Feasibility for Task and Motion Planning in Tabletop Environments , 2019, IEEE Robotics and Automation Letters.

[3]  Kristen Grauman,et al.  Dexterous Robotic Grasping with Object-Centric Visual Affordances , 2020, ArXiv.

[4]  Danica Kragic,et al.  Learning task constraints for robot grasping using graphical models , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[5]  Scott Kuindersma,et al.  Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..

[6]  Jung-Su Ha,et al.  Deep Visual Reasoning: Learning to Predict Action Sequences for Task and Motion Planning from an Initial Scene Image , 2020, Robotics: Science and Systems.

[7]  Fabio Viola,et al.  Learning and Querying Fast Generative Models for Reinforcement Learning , 2018, ArXiv.

[8]  Peter K. Allen,et al.  Semantic grasping: Planning robotic grasps functionally suitable for an object manipulation task , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[9]  Leslie Pack Kaelbling,et al.  Pre-image Backchaining in Belief Space for Mobile Manipulation , 2011, ISRR.

[10]  Andrew G. Barto,et al.  Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.

[11]  L. P. Kaelbling,et al.  Learning Symbolic Models of Stochastic Domains , 2007, J. Artif. Intell. Res..

[12]  George Konidaris,et al.  Learning Symbolic Representations for Planning with Parameterized Skills , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[13]  Martin A. Riedmiller,et al.  Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images , 2015, NIPS.

[14]  Leslie Pack Kaelbling,et al.  Active Model Learning and Diverse Action Sampling for Task and Motion Planning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[15]  Oliver Kroemer,et al.  Learning grasp affordance densities , 2011, Paladyn J. Behav. Robotics.

[16]  Leslie Pack Kaelbling,et al.  Constructing Symbolic Representations for High-Level Planning , 2014, AAAI.

[17]  Kristen Grauman,et al.  Learning Affordance Landscapes for Interaction Exploration in 3D Environments , 2020, NeurIPS.

[18]  Sergey Levine,et al.  Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[19]  Leslie Pack Kaelbling,et al.  Hierarchical task and motion planning in the now , 2011, 2011 IEEE International Conference on Robotics and Automation.

[20]  Marc Toussaint,et al.  Differentiable Physics and Stable Modes for Tool-Use and Manipulation Planning , 2018, Robotics: Science and Systems.

[21]  George Konidaris,et al.  Learning Portable Representations for High-Level Planning , 2019, ICML.

[22]  Leslie Pack Kaelbling,et al.  Integrated Task and Motion Planning , 2020, Annu. Rev. Control. Robotics Auton. Syst..

[23]  Sachin Chitta,et al.  MoveIt! [ROS Topics] , 2012, IEEE Robotics Autom. Mag..

[24]  Marc Toussaint,et al.  Logic-Geometric Programming: An Optimization-Based Approach to Combined Task and Motion Planning , 2015, IJCAI.

[25]  Sergey Levine,et al.  Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control , 2018, ArXiv.

[26]  Steven M. LaValle,et al.  RRT-connect: An efficient approach to single-query path planning , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[27]  Silvio Savarese,et al.  Learning task-oriented grasping for tool manipulation from simulated self-supervision , 2018, Robotics: Science and Systems.

[28]  Vladlen Koltun,et al.  Learning to Act by Predicting the Future , 2016, ICLR.

[29]  Jitendra Malik,et al.  Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.

[30]  Stefanie Tellex,et al.  Goal-Based Action Priors , 2015, ICAPS.

[31]  Leslie Pack Kaelbling,et al.  From Skills to Symbols: Learning Symbolic Representations for Abstract High-Level Planning , 2018, J. Artif. Intell. Res..

[32]  Sergey Levine,et al.  Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[33]  Honglak Lee,et al.  Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.

[34]  Ruben Villegas,et al.  Learning Latent Dynamics for Planning from Pixels , 2018, ICML.

[35]  Leslie Pack Kaelbling,et al.  Learning composable models of parameterized skills , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[36]  Bruno Castro da Silva,et al.  Active Learning of Parameterized Skills , 2014, ICML.

[37]  Leslie Pack Kaelbling,et al.  Integrated task and motion planning in belief space , 2013, Int. J. Robotics Res..

[38]  Reuven Y. Rubinstein,et al.  Optimization of computer simulation models with rare events , 1997 .

[39]  Victoria Xia,et al.  Learning sparse relational transition models , 2018, ICLR.

[40]  三嶋 博之 The theory of affordances , 2008 .

[41]  Sergey Levine,et al.  Deep spatial autoencoders for visuomotor learning , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[42]  Maya Cakmak,et al.  The learning and use of traversability affordance using range images on a mobile robot , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[43]  Leslie Pack Kaelbling,et al.  PDDLStream: Integrating Symbolic Planners and Blackbox Samplers via Optimistic Adaptive Planning , 2020, ICAPS.

[44]  Jung-Su Ha,et al.  Deep Visual Heuristics: Learning Feasibility of Mixed-Integer Programs for Manipulation Planning , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[45]  Erik Talvitie,et al.  Simple Local Models for Complex Dynamical Systems , 2008, NIPS.

[46]  Stefan Wermter,et al.  Training Agents With Interactive Reinforcement Learning and Contextual Affordances , 2016, IEEE Transactions on Cognitive and Developmental Systems.

[47]  Misha Denil,et al.  Learning Awareness Models , 2018, ICLR.

[48]  Leslie Pack Kaelbling,et al.  Symbol Acquisition for Probabilistic High-Level Planning , 2015, IJCAI.

[49]  Doina Precup,et al.  What can I do here? A Theory of Affordances in Reinforcement Learning , 2020, ICML.

[50]  Gabriel Barth-Maron,et al.  Toward Affordance-Aware Planning , 2010 .

[51]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Alberto Rodriguez,et al.  Learning Synergies Between Pushing and Grasping with Self-Supervised Deep Reinforcement Learning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[53]  Satinder Singh,et al.  Value Prediction Network , 2017, NIPS.