Neural Production Systems

Visual environments are structured, consisting of distinct objects or entities. These entities have properties -- both visible and latent -- that determine the manner in which they interact with one another. To partition images into entities, deep-learning researchers have proposed structural inductive biases such as slot-based architectures. To model interactions among entities, equivariant graph neural nets (GNNs) are used, but these are not particularly well suited to the task for two reasons. First, GNNs do not predispose interactions to be sparse, as relationships among independent entities are likely to be. Second, GNNs do not factorize knowledge about interactions in an entity-conditional manner. As an alternative, we take inspiration from cognitive science and resurrect a classic approach, production systems, which consist of a set of rule templates that are applied by binding placeholder variables in the rules to specific entities. Rules are scored on their match to entities, and the best-fitting rules are applied to update entity properties. In a series of experiments, we demonstrate that this architecture achieves a flexible, dynamic flow of control and serves to factorize entity-specific and rule-based information. This disentangling of knowledge achieves robust future-state prediction in rich visual environments, outperforming state-of-the-art methods that use GNNs, and allows for extrapolation from simple (few-object) environments to more complex environments.
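To make the rule-binding mechanism described above concrete, the following is a minimal PyTorch sketch of one rule-application step, written under stated assumptions rather than as the paper's implementation: entities are assumed to be slot vectors from a slot-based encoder, each rule is a learned condition embedding paired with a small MLP, a single (primary entity, rule) pair is selected per step via a straight-through Gumbel-softmax, and a second attention step binds the rule's contextual placeholder variable. The class name `ProductionSystemSketch`, the dimensions, and the one-pair-per-step simplification are illustrative choices, not details taken from the paper.

```python
# A minimal sketch of the rule-selection-and-application idea described in the
# abstract, NOT the authors' implementation: each rule is a learned condition
# embedding plus a small MLP, entities are slot vectors, and one (entity, rule)
# pair is selected per step. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProductionSystemSketch(nn.Module):
    def __init__(self, n_rules: int = 4, entity_dim: int = 64, rule_dim: int = 32):
        super().__init__()
        # One embedding per rule template (the rule's "condition" signature).
        self.rule_embeddings = nn.Parameter(torch.randn(n_rules, rule_dim))
        # One MLP per rule: consumes a (primary, contextual) entity pair and
        # produces an update to the primary entity's state.
        self.rule_mlps = nn.ModuleList(
            [nn.Sequential(nn.Linear(2 * entity_dim, 128),
                           nn.ReLU(),
                           nn.Linear(128, entity_dim))
             for _ in range(n_rules)]
        )
        self.entity_query = nn.Linear(entity_dim, rule_dim)

    def forward(self, entities: torch.Tensor) -> torch.Tensor:
        """entities: (batch, n_entities, entity_dim) slot representations."""
        B, N, _ = entities.shape

        # Score every (entity, rule) pair: how well does each rule's condition
        # match each entity's current state?
        q = self.entity_query(entities)                                # (B, N, rule_dim)
        scores = torch.einsum("bnd,rd->bnr", q, self.rule_embeddings)  # (B, N, n_rules)

        # Hard selection of a single (primary entity, rule) pair per example,
        # relaxed with a straight-through Gumbel-softmax to stay differentiable.
        choice = F.gumbel_softmax(scores.reshape(B, -1), tau=1.0, hard=True)
        choice = choice.reshape(B, N, -1)
        entity_sel = choice.sum(dim=2)                                 # (B, N): one-hot over entities
        rule_sel = choice.sum(dim=1)                                   # (B, n_rules): one-hot over rules
        primary = torch.einsum("bn,bnd->bd", entity_sel, entities)     # selected entity's state

        # Bind the rule's contextual placeholder variable: attend from the
        # primary entity over all entities to pick an interaction partner.
        attn = torch.softmax(torch.einsum("bd,bnd->bn", primary, entities), dim=1)
        context = torch.einsum("bn,bnd->bd", attn, entities)

        # Apply every rule MLP to the bound pair, then keep only the output of
        # the selected rule.
        pair = torch.cat([primary, context], dim=-1)                   # (B, 2 * entity_dim)
        outputs = torch.stack([mlp(pair) for mlp in self.rule_mlps], dim=1)
        update = torch.einsum("br,brd->bd", rule_sel, outputs)         # (B, entity_dim)

        # Sparse write-back: only the selected entity's slot changes.
        return entities + entity_sel.unsqueeze(-1) * update.unsqueeze(1)
```

In practice such a step would be iterated over time on slot representations produced by an object-centric encoder; the sparsity motivated in the abstract falls out of the hard selection, since only the chosen entity's slot is updated while all other slots pass through unchanged.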
