Full-coverage Identification of English Light Verb Constructions

The identification of light verb constructions (LVCs) is an important task for several applications. Previous studies focused on a limited set of light verb constructions. Here, we address the full coverage of LVCs. We investigate the performance of different candidate extraction methods on two English corpora annotated with full LVC coverage, and find that less restrictive candidate extraction methods should be applied. We then follow a machine learning approach that uses an extended, rich feature set to select LVCs among the extracted candidates.
