Grammatical Error Correction: Machine Translation and Classifiers

We focus on two leading state-of-the-art approaches to grammatical error correction: machine learning classification and machine translation. Based on a comparative study of the two learning frameworks and an error analysis of the output of state-of-the-art systems, we identify key strengths and weaknesses of each approach and demonstrate their complementarity. In particular, the machine translation method learns from parallel data without requiring further linguistic input and is better at correcting complex mistakes. The classification approach possesses other desirable characteristics, such as the ability to generalize easily beyond what was seen in training, the ability to train without human-annotated data, and the flexibility to adjust knowledge sources for individual error types. Based on this analysis, we develop an algorithmic approach that combines the strengths of both methods. We present several systems built on resources used in previous work that achieve a relative improvement of over 20% (7.4 F-score points) over the previous state of the art.
