Automatically learning semantic knowledge about multiword predicates

Highly frequent and highly polysemous verbs, such as give, take, and make, pose a challenge to automatic lexical acquisition methods. These verbs participate widely in multiword predicates (such as light verb constructions, or LVCs), to which they contribute a broad range of figurative meanings that must be recognized. Here, we focus on two properties that are key to the computational treatment of LVCs. First, we consider how figurative the semantic contribution of such a verb is in the various LVCs it participates in. Second, we explore the patterns of acceptability of LVCs and their productivity over semantically related combinations. To assess these properties, we develop statistical measures of figurativeness and acceptability that draw on linguistic properties of LVCs. We demonstrate that these corpus-based measures correlate well with human judgments of the relevant property. We also use the acceptability measure to estimate the degree to which a semantic class of nouns can productively form LVCs with a given verb. The linguistically motivated measures outperform a standard measure of the collocational strength of these multiword expressions.
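
To make the idea of a collocation-strength baseline concrete, here is a minimal sketch of one widely used measure, pointwise mutual information (PMI), computed over verb-noun pairs. The abstract does not name the standard measure it compares against, so the choice of PMI is an assumption, and the function name `pmi_scores` and the toy pair counts are invented purely for illustration; they are not the authors' data or implementation.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return PMI (in bits) for each distinct (verb, noun) pair in `pairs`.

    PMI(v, n) = log2( P(v, n) / (P(v) * P(n)) ), with all probabilities
    estimated by relative frequency over the list of observed pairs.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), joint in pair_counts.items():
        p_joint = joint / total
        p_verb = verb_counts[verb] / total
        p_noun = noun_counts[noun] / total
        scores[(verb, noun)] = math.log2(p_joint / (p_verb * p_noun))
    return scores


# Hypothetical toy counts (assumption): not drawn from any real corpus.
toy_pairs = (
    [("give", "groan")] * 3
    + [("give", "book")] * 1
    + [("take", "book")] * 3
    + [("take", "walk")] * 1
)

scores = pmi_scores(toy_pairs)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

A measure of this kind only captures how strongly a verb and a noun are associated; it has no access to the linguistic properties of LVCs that the proposed figurativeness and acceptability measures draw on.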
