Crossing the border twice: Reimporting prepositions to alleviate L1-specific transfer errors

We present a data-driven approach that exploits word alignment in a large parallel corpus to identify verb-preposition and adjective-preposition combinations that are difficult for second-language (L2) learners. On the one hand, this lets us provide language-specific ranked lists that help learners focus on the combinations that are particularly challenging given their native language (L1). On the other hand, we provide extensive statistics on such combinations to facilitate automatic correction of preposition errors in learner texts. We evaluate the lists in two ways: manually, and automatically by applying our statistics to an error-correction task.
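To make the idea of a language-specific ranked list concrete, the following is a minimal sketch and not the method of the paper. It assumes word alignment has already produced triples of (English head word, English preposition, aligned L1 preposition). Under that assumption, it treats a combination as risky when the English preposition differs from the one the L1 preposition most often aligns to overall. The function name `rank_transfer_risk`, the scoring rule, and the toy German-English triples are all hypothetical illustrations.

```python
from collections import Counter, defaultdict


def rank_transfer_risk(triples):
    """Rank (head, English preposition) pairs by an assumed transfer-risk score.

    triples: iterable of (head, en_prep, l1_prep) tuples, assumed to come
    from word-aligned sentence pairs. The scoring rule is an illustrative
    assumption, not the statistic used in the paper.
    """
    triples = list(triples)

    # Step 1: for each L1 preposition, find the English preposition it is
    # most often aligned to (its "default" translation back into English).
    back = defaultdict(Counter)
    for _, en_prep, l1_prep in triples:
        back[l1_prep][en_prep] += 1
    default = {l1: counts.most_common(1)[0][0] for l1, counts in back.items()}

    # Step 2: count how often each combination occurs, and how often the
    # default back-translation would yield a different English preposition.
    total = Counter()
    mismatch = Counter()
    for head, en_prep, l1_prep in triples:
        total[(head, en_prep)] += 1
        if default[l1_prep] != en_prep:
            mismatch[(head, en_prep)] += 1

    # Step 3: sort by mismatch rate (highest first), then by frequency,
    # then alphabetically so the order is deterministic.
    scored = [(combo, mismatch[combo] / total[combo], total[combo])
              for combo in total]
    scored.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return scored


# Toy, invented alignment triples (German as L1), for illustration only.
toy = [
    ("depend", "on", "von"),
    ("depend", "on", "von"),
    ("speak", "of", "von"),
    ("speak", "of", "von"),
    ("speak", "of", "von"),
]

ranked = rank_transfer_risk(toy)
for (head, prep), rate, freq in ranked:
    print(f"{head} {prep}: mismatch rate {rate:.2f}, frequency {freq}")
```

In this toy data "von" aligns most often to "of", so "depend on" is ranked above "speak of". A real system would need far more data and a smoothed, better-motivated score.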
