Is "my favorite new movie" my favorite movie? Probing the Understanding of Recursive Noun Phrases

Recursive noun phrases (NPs) have interesting semantic properties. For example, *my favorite new movie* is not necessarily my favorite movie, whereas *my new favorite movie* is. This is common sense to humans, yet it is unknown whether language models have such knowledge. We introduce the Recursive Noun Phrase Challenge (RNPC), a challenge set targeting the understanding of recursive NPs. When evaluated on our dataset, state-of-the-art Transformer models achieve only around-chance performance. Still, we show that such knowledge is learnable with appropriate data. We further probe the models for relevant linguistic features that can be learned from our tasks, including modifier semantic category and modifier scope. Finally, models trained on RNPC achieve strong zero-shot performance on an extrinsic Harm Detection task, showing the usefulness of understanding recursive NPs in downstream applications.
