Semantics-Driven Recognition of Collocations Using Word Embeddings

L2 learners often produce “ungrammatical” word combinations such as *give a suggestion or *make a walk. These errors arise from the “collocationality” of one of the items (the base), which restricts the collocates that can express a specific meaning (‘perform’ in the examples above). We propose an algorithm that delivers, for a given base and the intended meaning of a collocate, the actual collocate lexeme(s) (make and take, respectively, in the examples above). The algorithm exploits the linear mapping between the word embeddings of bases and collocates, learned from example collocations, to generate a collocation transformation matrix, which is then applied to unseen cases. The evaluation suggests that this is a promising line of research in collocation discovery.
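The idea of a collocation transformation matrix can be illustrated with a minimal sketch. It assumes that the matrix is fitted by ordinary least squares over pairs of base and collocate embeddings and that candidate collocates are ranked by cosine similarity to the projected base; the abstract does not specify either choice. The function names, the toy vocabulary, and the random vectors standing in for real embeddings are all hypothetical.

```python
import numpy as np


def learn_transformation(base_vecs, collocate_vecs):
    """Fit W by least squares so that base_vecs @ W approximates collocate_vecs.

    Assumption: one matrix is learned per collocate meaning (e.g. 'perform').
    """
    W, *_ = np.linalg.lstsq(base_vecs, collocate_vecs, rcond=None)
    return W


def rank_collocates(base_vec, W, vocab, vocab_vecs, top_k=3):
    """Project a base into collocate space and rank vocabulary by cosine similarity."""
    projected = base_vec @ W
    norms = np.linalg.norm(vocab_vecs, axis=1) * np.linalg.norm(projected)
    sims = (vocab_vecs @ projected) / np.maximum(norms, 1e-12)
    order = np.argsort(-sims)[:top_k]
    return [vocab[i] for i in order]


# Toy data: random vectors stand in for real word embeddings (hypothetical).
rng = np.random.default_rng(0)
dim = 8
W_true = rng.normal(size=(dim, dim))          # hidden mapping used to build the toy data
train_bases = rng.normal(size=(50, dim))      # embeddings of known bases
train_collocates = train_bases @ W_true       # embeddings of their attested collocates

W = learn_transformation(train_bases, train_collocates)

# An unseen base: its true collocate is planted in a small candidate vocabulary.
new_base = rng.normal(size=dim)
vocab = ["give", "do", "take", "have", "make"]
vocab_vecs = rng.normal(size=(len(vocab), dim))
vocab_vecs[2] = new_base @ W_true             # "take" is the planted answer

best = rank_collocates(new_base, W, vocab, vocab_vecs, top_k=1)
```

With real data, `train_bases` and `train_collocates` would hold pretrained embeddings of attested base and collocate pairs for one meaning, and `vocab_vecs` the embeddings of the candidate collocates.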
