Memorize or generalize? Searching for a compositional RNN in a haystack

Neural networks are very powerful learning systems, but they do not readily generalize from one task to the other. This is partly due to the fact that they do not learn in a compositional way, that is, by discovering skills that are shared by different tasks, and recombining them to solve new problems. In this paper, we explore the compositional generalization capabilities of recurrent neural networks (RNNs). We first propose the lookup table composition domain as a simple setup to test compositional behaviour and show that it is theoretically possible for a standard RNN to learn to behave compositionally in this domain when trained with standard gradient descent and provided with additional supervision. We then remove this additional supervision and perform a search over a large number of model initializations to investigate the proportion of RNNs that can still converge to a compositional solution. We discover that a small but non-negligible proportion of RNNs do reach partial compositional solutions even without special architectural constraints. This suggests that a combination of gradient descent and evolutionary strategies directly favouring the minority models that developed more compositional approaches might suffice to lead standard RNNs towards compositional solutions.

[1]  László Dezsö,et al.  Universal Grammar , 1981, Certainty in Action.

[2]  J. Fodor,et al.  Connectionism and cognitive architecture: A critical analysis , 1988, Cognition.

[3]  urgen Schmidhuber Towards Compositional Learning in Dynamic Networks , 1990 .

[4]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[5]  G. Reeke Marvin Minsky, The Society of Mind , 1991, Artif. Intell..

[6]  C. Lee Giles,et al.  Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks , 1992, Neural Computation.

[7]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[8]  Doina Precup,et al.  Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..

[9]  Ernest Lepore,et al.  The compositionality papers , 2002 .

[10]  Noam Chomsky,et al.  The faculty of language: what is it, who has it, and how did it evolve? , 2002, Science.

[11]  Risto Miikkulainen,et al.  Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.

[12]  Sridhar Mahadevan,et al.  Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[13]  Satinder Singh Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.

[14]  Peter Stone,et al.  Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..

[15]  Lukás Burget,et al.  Recurrent neural network based language model , 2010, INTERSPEECH.

[16]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[17]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[18]  Joshua B. Tenenbaum,et al.  Probing the Compositionality of Intuitive Functions , 2016, NIPS.

[19]  Joshua B. Tenenbaum,et al.  Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.

[20]  Dan Klein,et al.  Learning to Compose Neural Networks for Question Answering , 2016, NAACL.

[21]  Steven Piantadosi,et al.  Compositional Reasoning in Early Childhood , 2016, PloS one.

[22]  Li Fei-Fei,et al.  CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Saurabh Kumar,et al.  Learning to Compose Skills , 2017, ArXiv.

[24]  Dan Klein,et al.  Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.

[25]  Trevor Darrell,et al.  Learning to Reason: End-to-End Module Networks for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[26]  Marco Baroni,et al.  Still not systematic after all these years: On the compositional skills of sequence-to-sequence recurrent networks , 2017, ICLR 2018.

[27]  Eran Yahav,et al.  Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples , 2017, ICML.

[28]  Marco Baroni,et al.  Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks , 2017, ICML.

[29]  Sebastian Risi,et al.  Born to Learn: the Inspiration, Progress, and Future of Evolved Plastic Artificial Neural Networks , 2017, Neural Networks.