Paper information - Noisy Channel for Low Resource Grammatical Error Correction

Noisy Channel for Low Resource Grammatical Error Correction

This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model with a language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types, and 2) OpenAI’s GPT-2 model, exploiting its ability to use previous sentences as context. Furthermore, we search for the optimal combination of corrections using beam search.
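The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The edit counts, the bigram table, and the names `EDIT_COUNTS`, `BIGRAM_LOGP`, `build_channel`, and `correct` are all hypothetical, and a toy bigram model stands in for the pre-trained BERT and GPT-2 models. Each candidate correction is scored as log P(observed | intended) from the channel model plus log P(intended | previous word) from the language model, and beam search keeps the best partial corrections.

```python
import math
from collections import defaultdict

# Hypothetical (intended, observed) counts, standing in for edit frequencies
# mined from Wikipedia revision history. Identity pairs count unedited uses.
EDIT_COUNTS = {
    ("their", "their"): 90, ("their", "there"): 10,
    ("there", "there"): 95, ("there", "their"): 5,
    ("is", "is"): 98, ("is", "are"): 2,
    ("are", "are"): 94, ("are", "is"): 6,
}

# Toy bigram log-probabilities: an assumed stand-in for a pre-trained LM.
BIGRAM_LOGP = {
    ("<s>", "there"): math.log(0.2), ("<s>", "their"): math.log(0.1),
    ("there", "is"): math.log(0.3), ("there", "are"): math.log(0.2),
    ("their", "is"): math.log(0.001), ("their", "are"): math.log(0.001),
    ("is", "hope"): math.log(0.05), ("are", "hope"): math.log(0.01),
}
UNSEEN_LOGP = math.log(1e-4)  # flat penalty for bigrams missing from the table


def build_channel(edit_counts):
    """Return channel[observed][intended] = log P(observed | intended)."""
    totals = defaultdict(int)
    for (intended, _observed), n in edit_counts.items():
        totals[intended] += n
    channel = defaultdict(dict)
    for (intended, observed), n in edit_counts.items():
        channel[observed][intended] = math.log(n / totals[intended])
    return channel


def correct(tokens, channel, beam_width=3):
    """Beam search over per-token corrections under channel + LM scores."""
    beams = [(0.0, ["<s>"])]
    for observed in tokens:
        # Confusion set: every intended word seen to surface as `observed`.
        # Words without edit statistics are kept unchanged at no cost.
        candidates = channel.get(observed, {observed: 0.0})
        expanded = []
        for score, history in beams:
            for intended, channel_logp in candidates.items():
                lm_logp = BIGRAM_LOGP.get((history[-1], intended), UNSEEN_LOGP)
                expanded.append(
                    (score + channel_logp + lm_logp, history + [intended])
                )
        beams = sorted(expanded, key=lambda b: b[0], reverse=True)[:beam_width]
    return beams[0][1][1:]  # best hypothesis without the start symbol


CHANNEL = build_channel(EDIT_COUNTS)
print(" ".join(correct("their is hope".split(), CHANNEL)))  # there is hope
```

With these toy numbers, a beam width of 1 keeps "their" after the first token and never recovers, while a width of 3 retains "there" long enough for the language model to prefer "there is". This is the reason to search over combinations of corrections rather than deciding each token greedily.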
