A Cohesion Graph Based Approach for Unsupervised Recognition of Literal and Non-literal Use of Multiword Expressions

We present a graph-based model for representing the lexical cohesion of a discourse. In the graph structure, vertices correspond to the content words of a text, and edges between pairs of words encode how closely the words are semantically related. We show that such a structure can be used to distinguish literal and non-literal usages of multiword expressions.
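To make the idea concrete, the following is a minimal sketch of how a cohesion graph might be used for this task. The abstract does not spell out the decision rule, so the one used here is an assumption: an expression is treated as literal if including its component words does not lower the average edge weight of the graph. The function names, the toy relatedness scores, and the default score of 0.1 are all hypothetical and stand in for a real semantic relatedness measure.

```python
from itertools import combinations

# Hypothetical toy relatedness scores in [0, 1]; a real system would
# replace these with a corpus- or thesaurus-based relatedness measure.
TOY_SCORES = {
    frozenset(("break", "ice")): 0.7,
    frozenset(("ice", "frozen")): 0.9,
    frozenset(("ice", "lake")): 0.8,
    frozenset(("ice", "water")): 0.8,
    frozenset(("break", "frozen")): 0.5,
    frozenset(("break", "lake")): 0.4,
    frozenset(("break", "water")): 0.4,
    frozenset(("frozen", "lake")): 0.5,
    frozenset(("frozen", "water")): 0.5,
    frozenset(("lake", "water")): 0.5,
    frozenset(("party", "joke")): 0.5,
    frozenset(("party", "guests")): 0.8,
    frozenset(("joke", "guests")): 0.5,
}


def toy_relatedness(a, b):
    """Look up a symmetric score; unknown pairs get an assumed low default."""
    return TOY_SCORES.get(frozenset((a, b)), 0.1)


def average_connectivity(words, relatedness):
    """Mean edge weight over all word pairs (the graph is fully connected)."""
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    return sum(relatedness(a, b) for a, b in pairs) / len(pairs)


def is_literal(context_words, mwe_words, relatedness):
    """Assumed rule: literal if the MWE words do not reduce cohesion."""
    with_mwe = average_connectivity(
        list(context_words) + list(mwe_words), relatedness
    )
    without_mwe = average_connectivity(list(context_words), relatedness)
    return with_mwe >= without_mwe


mwe = ["break", "ice"]
literal_context = ["frozen", "lake", "water"]
nonliteral_context = ["party", "joke", "guests"]

print(is_literal(literal_context, mwe, toy_relatedness))
print(is_literal(nonliteral_context, mwe, toy_relatedness))
```

In the first context the component words of "break the ice" are strongly related to the surrounding content words, so adding them keeps the graph cohesive; in the second they are only weakly related to the context, so average connectivity drops and the sketch labels the usage non-literal.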
