Annotating ESL Errors: Challenges and Rewards

In this paper, we present a corrected and error-tagged corpus of essays written by non-native speakers of English. The corpus contains 63,000 words and includes data from learners of English with nine different first-language backgrounds. The annotation was performed at the sentence level and involved correcting all errors in each sentence. The error classification covers mistakes in preposition and article usage, as well as errors in grammar, word order, and word choice. We present an analysis of the errors in the annotated corpus by error category and first-language background, and we report inter-annotator agreement on the task. We also describe the annotation tool that was developed to facilitate and standardize the annotation procedure. The tool supports the annotation of various types of mistakes and was used to annotate the corpus.
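To make the sentence-level scheme concrete, the following is a minimal sketch of how one corrected, error-tagged sentence might be represented and how errors could be tallied by category and first language. The class names, field names, example sentences, and first-language labels are illustrative assumptions and are not taken from the corpus or the annotation tool.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ErrorTag:
    """One tagged error inside a sentence (hypothetical record layout)."""
    category: str    # e.g. "preposition", "article", "word order"
    original: str    # the learner's text; empty if a word was missing
    correction: str  # the annotator's replacement


@dataclass
class AnnotatedSentence:
    """A learner sentence with its full correction and its error tags."""
    first_language: str
    original: str
    corrected: str
    errors: List[ErrorTag] = field(default_factory=list)


def count_errors(sentences: List[AnnotatedSentence]) -> Tuple[Counter, Counter]:
    """Tally error tags by category and by the writer's first language."""
    by_category: Counter = Counter()
    by_language: Counter = Counter()
    for sentence in sentences:
        for error in sentence.errors:
            by_category[error.category] += 1
            by_language[sentence.first_language] += 1
    return by_category, by_language


# Invented example data, for illustration only.
corpus = [
    AnnotatedSentence(
        first_language="Russian",
        original="He arrived to airport late.",
        corrected="He arrived at the airport late.",
        errors=[
            ErrorTag("preposition", "to", "at"),
            ErrorTag("article", "", "the"),
        ],
    ),
    AnnotatedSentence(
        first_language="Chinese",
        original="I very like this book.",
        corrected="I like this book very much.",
        errors=[ErrorTag("word order", "very like", "like ... very much")],
    ),
]

by_category, by_language = count_errors(corpus)
```

Storing the fully corrected sentence alongside the individual tags mirrors the requirement that all errors in a sentence be corrected, while the per-error tags support the breakdown by category and first-language background.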
