Grammatical Error Detection Using Error- and Grammaticality-Specific Word Embeddings

In this study, we improve grammatical error detection by learning word embeddings that take grammaticality and error patterns into account. Most existing algorithms for learning word embeddings model only the syntactic context of words, so classifiers treat erroneous and correct words as similar inputs. We address this limitation of purely contextual information by considering learner errors. Specifically, we propose two models: one that employs grammatical error patterns and another that considers the grammaticality of the target word. We determine the grammaticality of n-gram sequences from annotated error tags and extract grammatical error patterns for word embeddings from large-scale learner corpora. Experimental results show that a bidirectional long short-term memory model initialized with our word embeddings achieved state-of-the-art accuracy by a large margin in an English grammatical error detection task on the First Certificate in English dataset.
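As a rough illustration of how n-gram grammaticality might be derived from annotated error tags, the sketch below labels each fixed-size window by whether its center (target) word is tagged as an error. The function name `label_windows`, the `<pad>` token, the window size, and the rule that only the target word's tag decides the label are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple

PAD = "<pad>"  # hypothetical padding token for sentence boundaries


def label_windows(
    tokens: List[str], is_error: List[bool], n: int = 3
) -> List[Tuple[Tuple[str, ...], str, bool]]:
    """Pair each n-gram window with its target word and a grammaticality label.

    Assumption: a window counts as grammatical when its center word
    carries no error tag; tags on context words are ignored.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so the window has a center word")
    if len(tokens) != len(is_error):
        raise ValueError("tokens and is_error must have the same length")

    half = n // 2
    padded = [PAD] * half + tokens + [PAD] * half
    examples = []
    for i, target in enumerate(tokens):
        window = tuple(padded[i : i + n])  # window centered on tokens[i]
        examples.append((window, target, not is_error[i]))
    return examples


# Toy learner sentence with a verb-form error on "went".
examples = label_windows(
    ["I", "have", "went", "home"], [False, False, True, False]
)
```

Each resulting triple could then serve as a training example for an embedding objective that predicts the grammaticality label alongside the usual context signal.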
