Compositional Generalization via Neural-Symbolic Stack Machines

Despite their tremendous success, existing deep learning models show limited capability for compositional generalization, i.e., learning compositional rules and applying them to unseen cases in a systematic manner. To tackle this issue, we propose the Neural-Symbolic Stack Machine (NeSS). It contains a neural network that generates execution traces, which are then executed by a symbolic stack machine enhanced with sequence manipulation operations. NeSS combines the expressive power of neural sequence models with the recursion supported by the symbolic stack machine. Without training supervision on execution traces, NeSS achieves 100% generalization performance in four domains: the SCAN benchmark of language-driven navigation tasks, the task of few-shot learning of compositional instructions, the compositional machine translation benchmark, and context-free grammar parsing tasks.
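The abstract describes the execution model only at a high level. As a rough intuition for how a predicted trace of stack operations can compose a target sequence, the following Python sketch implements a toy stack machine with sequence-manipulation operations. The operation names (SHIFT, REDUCE, CONCAT) and the primitive rules are illustrative assumptions for a SCAN-style example, not the exact instruction set defined in the paper.

```python
# A minimal sketch, assuming hypothetical operation names and toy rules,
# of a symbolic stack machine that executes a trace predicted by a
# neural component. Not the paper's actual implementation.

class StackMachine:
    def __init__(self, tokens):
        self.queue = list(tokens)   # remaining source tokens
        self.stack = []             # frames holding partial target sequences

    def shift(self):
        # Move the next source token into a new frame on top of the stack.
        self.stack.append([self.queue.pop(0)])

    def reduce(self, rule):
        # Rewrite every token in the top frame with a learned primitive rule.
        self.stack[-1] = [out for tok in self.stack[-1] for out in rule(tok)]

    def concat(self, order="forward"):
        # Sequence manipulation: merge the two topmost frames in either order.
        top, below = self.stack.pop(), self.stack.pop()
        self.stack.append(below + top if order == "forward" else top + below)

    def output(self):
        return self.stack[-1]


# Toy trace for the SCAN-style command "jump left" -> LTURN JUMP.
rule = lambda tok: {"jump": ["JUMP"], "left": ["LTURN"]}.get(tok, [tok])
machine = StackMachine(["jump", "left"])
machine.shift(); machine.reduce(rule)   # stack: [["JUMP"]]
machine.shift(); machine.reduce(rule)   # stack: [["JUMP"], ["LTURN"]]
machine.concat(order="reverse")         # stack: [["LTURN", "JUMP"]]
print(machine.output())                 # ['LTURN', 'JUMP']
```

Because the machine manipulates whole sub-sequences rather than individual output tokens, the same trace structure generalizes to longer or recombined commands, which is the source of the compositional generalization the paper targets.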
