Out-of-the-box neural networks can support combinatorial generalization

Combinatorial generalization - the ability to understand and produce novel combinations of already familiar elements - is considered a core capacity of the human mind and a major challenge for neural network models. A significant body of research suggests that conventional neural networks cannot solve this problem unless they are endowed with mechanisms specifically engineered for representing symbols. In this paper we introduce a novel way of representing symbolic structures in connectionist terms - the vectors approach to representing symbols (VARS) - which allows standard neural architectures to be trained to encode symbolic knowledge explicitly at their output layers. In two simulations, we show that out-of-the-box neural networks not only can learn to produce VARS representations, but in doing so achieve combinatorial generalization. This adds to other recent work showing improved combinatorial generalization under specific training conditions, and raises the question of whether special mechanisms are indeed needed to support symbolic processing.
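
The abstract does not spell out the VARS encoding, so the following is only a minimal, hypothetical sketch (in Keras) of what "encoding symbolic knowledge explicitly at the output layer" could look like: each role owns a fixed block of output units, and the filler bound to that role is written into that block as a one-hot vector, so a standard feed-forward network is simply trained to emit the structured code. The role and filler set sizes, the conjunctive input coding, and the `encode` helper are illustrative assumptions, not the authors' actual simulations or representation scheme.

```python
import itertools
import numpy as np
import tensorflow as tf

# Toy setup (assumed for illustration): a "scene" assigns one of N_FILLERS
# symbols to each of N_ROLES roles. The input is an unstructured conjunctive
# code (one unit per whole role-filler combination); the target is a
# VARS-style structured output in which each role owns a fixed block of units
# and the filler bound to that role appears there as a one-hot vector.
N_ROLES, N_FILLERS = 2, 8
COMBOS = list(itertools.permutations(range(N_FILLERS), N_ROLES))  # ordered pairs

def encode(combo_idx):
    """Return (conjunctive input, role-slot target) for one role-filler assignment."""
    x = np.zeros(len(COMBOS), dtype=np.float32)
    x[combo_idx] = 1.0                        # input: one unit per full combination
    y = np.zeros(N_ROLES * N_FILLERS, dtype=np.float32)
    for role, filler in enumerate(COMBOS[combo_idx]):
        y[role * N_FILLERS + filler] = 1.0    # output block `role` holds one-hot `filler`
    return x, y

X, Y = map(np.array, zip(*(encode(i) for i in range(len(COMBOS)))))

# A plain, out-of-the-box feed-forward network trained to produce the
# structured (VARS-like) output code from the unstructured input code.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_ROLES * N_FILLERS, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, Y, epochs=200, batch_size=16, verbose=0)
```

In a test of combinatorial generalization, some role-filler combinations would be held out of training and the network probed on them; here all combinations are used only to keep the sketch short.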
