Automatically Constructing a Lexicon of Verb Phrase Idiomatic Combinations

We investigate the lexical and syntactic flexibility of a class of idiomatic expressions. We develop statistical, corpus-based measures that draw on these linguistic properties, and demonstrate that they can successfully distinguish idiomatic combinations from non-idiomatic ones. We also propose a means of automatically determining which syntactic forms a particular idiom can appear in, and hence which forms should be included in its lexical representation.
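The abstract does not spell out the measures themselves. Purely as an illustration of what a corpus-based score for lexical flexibility could look like, the sketch below compares the association strength of a verb-noun pair with that of variants in which the noun is swapped for a similar word. The use of pointwise mutual information, the z-score normalization, the function names, and the toy counts are all assumptions made for this example; they are not taken from the text above.

```python
import math
from statistics import mean, pstdev


def pmi(pair_count, verb_count, noun_count, total_pairs):
    """Pointwise mutual information (base 2) of a verb-noun pair from raw counts."""
    return math.log2((pair_count * total_pairs) / (verb_count * noun_count))


def lexical_fixedness(target, variants, pair_counts, verb_counts, noun_counts, total_pairs):
    """Hypothetical fixedness score: z-score of the target's PMI within the
    set formed by the target and its attested lexical variants.

    A high score means the target pair is much more strongly associated
    than its near-synonymous variants, i.e. it resists lexical substitution.
    """
    def score(pair):
        verb, noun = pair
        return pmi(pair_counts[pair], verb_counts[verb], noun_counts[noun], total_pairs)

    # Assumption: variants never seen in the corpus are skipped,
    # because PMI is undefined for a zero count.
    attested = [p for p in variants if pair_counts.get(p, 0) > 0]
    scores = [score(target)] + [score(p) for p in attested]

    spread = pstdev(scores)
    if spread == 0:
        return 0.0  # no variation to compare against
    return (scores[0] - mean(scores)) / spread


# Toy counts, invented for illustration only.
pair_counts = {("shoot", "breeze"): 30, ("shoot", "wind"): 1, ("shoot", "gust"): 1}
verb_counts = {"shoot": 200}
noun_counts = {"breeze": 50, "wind": 400, "gust": 60}
total_pairs = 100_000

fixedness = lexical_fixedness(
    ("shoot", "breeze"),
    [("shoot", "wind"), ("shoot", "gust"), ("shoot", "draft")],
    pair_counts, verb_counts, noun_counts, total_pairs,
)
print(f"lexical fixedness of 'shoot the breeze': {fixedness:.2f}")
```

With these invented counts the idiomatic pair scores well above its variants, which is the kind of contrast a flexibility-based measure would rely on. A comparable score for syntactic flexibility would need counts per syntactic pattern, which this sketch does not attempt.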
