Compositionality as Directional Consistency in Sequential Neural Networks

Sequential neural networks have shown success on a variety of natural language tasks, but the internal mechanisms by which they achieve the systematic compositionality crucial to language understanding remain an open question. In particular, gated networks such as Gated Recurrent Units (GRUs) are known to significantly outperform Simple Recurrent Networks (SRNs). We conduct an exploratory study comparing the abilities of SRNs and GRUs to make compositional generalizations, using adjective semantics as a testing ground. Our results demonstrate that GRUs generalize more systematically than SRNs. Analyzing the learned representations, we find that GRUs encode the compositional contribution of adjectives as directionally consistent linear displacements. This consistency correlates with generalization accuracy across GRU models, suggesting that it is an effective strategy for deriving more compositionally generalizable representations.

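The abstract does not spell out how directional consistency is quantified. The following is a minimal sketch of one plausible measure, assuming the adjective's contribution is taken as the displacement h(adjective noun) − h(noun) in hidden-state space and that consistency is the mean pairwise cosine similarity of these displacements across nouns; the function and variable names are illustrative, not the authors' implementation.

```python
import numpy as np

def directional_consistency(displacements):
    """Mean pairwise cosine similarity among an adjective's displacement vectors.

    `displacements` is an (n_nouns, d) array whose rows are assumed to be
    h(adjective + noun) - h(noun), i.e. the shift the adjective induces in
    the network's hidden representation of each noun.
    """
    unit = displacements / np.linalg.norm(displacements, axis=1, keepdims=True)
    sims = unit @ unit.T                       # pairwise cosine similarities
    n = len(unit)
    off_diag = sims[~np.eye(n, dtype=bool)]    # drop self-similarity terms
    return float(off_diag.mean())

# Illustration: displacements that all point in nearly the same direction
# score close to 1; random, inconsistent displacements score near 0.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
consistent = np.stack([base + 0.05 * rng.normal(size=50) for _ in range(10)])
print(round(directional_consistency(consistent), 3))   # close to 1.0
```

Under this reading, an adjective whose displacements are directionally consistent scores near 1, which is the property the abstract associates with better compositional generalization in GRUs.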