Paper Information - Adapting Sequence Models for Sentence Correction - 字舞流文

Adapting Sequence Models for Sentence Correction

In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. Our strongest sequence-to-sequence model improves over our strongest phrase-based statistical machine translation model, with access to the same data, by 6 M² (0.5 GLEU) points. Additionally, in the data environment of the standard CoNLL-2014 setup, we demonstrate that modeling (and tuning against) diffs yields similar or better M² scores with simpler models and/or significantly less data than previous sequence-to-sequence approaches.
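As a rough illustration of what "modeling the output data as a series of diffs" can look like, the sketch below rewrites a corrected sentence as the source sentence annotated with deletion and insertion spans, and then recovers the plain correction from that annotation. It is a minimal sketch under stated assumptions, not the authors' preprocessing: the word-level alignment via Python's `difflib`, the tag names `<del>`/`<ins>`, the function names, and the example sentence are all chosen here for illustration.

```python
import difflib


def to_diff_target(source: str, target: str) -> str:
    """Annotate the correction as word-level diffs against the source.

    Assumption: whitespace tokenization and difflib alignment; the tag
    names are illustrative, not taken from the paper.
    """
    src, tgt = source.split(), target.split()
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(src[i1:i2])
            continue
        if i2 > i1:  # source tokens removed by a "delete" or "replace"
            out += ["<del>", *src[i1:i2], "</del>"]
        if j2 > j1:  # target tokens added by an "insert" or "replace"
            out += ["<ins>", *tgt[j1:j2], "</ins>"]
    return " ".join(out)


def apply_diff_target(tagged: str) -> str:
    """Recover the corrected sentence from a diff-annotated sequence."""
    out, in_del = [], False
    for tok in tagged.split():
        if tok == "<del>":
            in_del = True
        elif tok == "</del>":
            in_del = False
        elif tok in ("<ins>", "</ins>"):
            continue  # keep inserted tokens, drop only the tags
        elif not in_del:
            out.append(tok)
    return " ".join(out)


if __name__ == "__main__":
    source = "She go to school yesterday ."
    target = "She went to school yesterday ."
    tagged = to_diff_target(source, target)
    print(tagged)
    print(apply_diff_target(tagged))
```

With a representation of this kind, a sentence that needs no correction maps to itself with no tags, so a model trained on such targets mostly copies the input and emits tagged spans only where an edit is proposed.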

Alexander M. Rush | Yoon Kim | Stuart M. Shieber | Allen Schmaltz
