Building a State-of-the-Art Grammatical Error Correction System

This paper identifies and examines the key principles behind building a state-of-the-art grammatical error correction system. We do so by analyzing the Illinois system, which placed first among seventeen teams in the CoNLL-2013 shared task on grammatical error correction. The system targets five types of errors common among non-native English writers. We describe four design principles relevant to correcting all of these errors, analyze the system along these dimensions, and show how each dimension contributes to performance.
