Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items

In this paper, we attempt to link the inner workings of a neural language model to linguistic theory, focusing on a complex phenomenon well discussed in formal linguistics: (negative) polarity items. We briefly discuss the leading hypotheses about the licensing contexts that allow negative polarity items and evaluate to what extent a neural language model has the ability to correctly process a subset of such constructions. We show that the model finds a relation between the licensing context and the negative polarity item and appears to be aware of the scope of this context, which we extract from a parse tree of the sentence. With this research, we hope to pave the way for other studies linking formal linguistics to deep learning.
