Structured agents for physical construction

Physical construction (the ability to compose objects, subject to physical dynamics, to serve some function) is fundamental to human intelligence. We introduce a suite of challenging physical construction tasks inspired by how children play with blocks, such as matching a target configuration, stacking blocks to connect objects together, and creating shelter-like structures over target objects. We examine how a range of deep reinforcement learning agents fare on these challenges, and introduce several new approaches that provide superior performance. Our results show that agents that use structured representations (e.g., objects and scene graphs) and structured policies (e.g., object-centric actions) outperform those that use less structured representations, and generalize better beyond their training when asked to reason about larger scenes. Model-based agents that use Monte-Carlo Tree Search also outperform strictly model-free agents in our most challenging construction problems. We conclude that approaches that combine structured representations and reasoning with powerful learning are a key path toward agents that possess rich intuitive physics, scene understanding, and planning.
