Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties

This volume contains the papers accepted for presentation at the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties. The workshop is endorsed by the Association for Computational Linguistics Special Interest Group on the Lexicon (SIGLEX) and is held in conjunction with COLING/ACL 2006 on July 23, 2006, in Sydney, Australia.

There has been a growing awareness in the NLP community of the problems that multiword expressions (MWEs) pose. Work in areas such as machine translation, text summarization, paraphrasing, grammar development and parsing, information retrieval, and question answering (to mention a few) has acknowledged difficulties arising from the idiosyncratic nature of multiword expressions.

This workshop continues a tradition of ACL workshops on Collocations (2001) and Multiword Expressions (2003 and 2004). Its specific objective is to focus on the underlying properties of MWEs. The call for papers expressed our interest in topics such as the definition of MWEs, the properties of MWEs and their impact on NLP applications, the representation and treatment of the different classes of MWEs, linguistic and psycholinguistic analyses of MWEs, the evaluation of extraction techniques, and the importance of (non-)compositionality.

We received 23 submissions in total. Each submission was reviewed by at least three members of the program committee, who not only judged each submission but also gave detailed comments to the authors. Of the submitted papers, 10 were selected for presentation at the workshop. Three of these were subsequently withdrawn by their authors, so seven papers are included in these proceedings.

The intention of this workshop is to focus on some fundamental questions about the nature of MWEs. To this end, we will allow plenty of time for discussion, so that participants can pursue some of the interesting, open and difficult questions that MWEs raise. In addition to a discussion period after each session of papers, we will organise group discussions at the end of the workshop. These will focus on the problems of defining, characterising and evaluating MWEs, given what we know about the range of phenomena they encompass, as well as on any important questions that arise during the workshop.
