Paper information: High-Order Sequence Modeling for Language Learner Error Detection

We address the problem of detecting errors in English language learner writing with a discriminative high-order sequence model. Unlike most work in error detection, this method is agnostic to specific error types, which potentially allows higher recall across different error types. The approach integrates features from many sources into the error-detection model, ranging from language-model-based features to linguistic analysis features. Evaluation on a large annotated corpus of learner writing indicates the feasibility of our approach on a realistic, noisy, and inherently skewed data set. High-order models consistently outperform low-order models in our experiments. Error analysis of the output shows that precision calculated on the test set represents a lower bound on real system performance.
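The claim that test-set precision is a lower bound can be illustrated with a small sketch. It assumes token-level binary labels (1 = error, 0 = no error), which the abstract does not specify; the function name `precision_recall` and the toy label lists are hypothetical and are not taken from the paper. The idea shown is that a flagged token counted as a false positive against the annotations may turn out to be a genuine error the annotators missed, so precision measured against a re-checked gold standard can only stay the same or rise.

```python
from typing import Sequence, Tuple


def precision_recall(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Token-level precision and recall for the 'error' class (label 1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # flagged and annotated
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # flagged, not annotated
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # annotated, not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: ten tokens, errors are rare (skewed classes).
annotated = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]  # original annotation
predicted = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]  # system output
# Assumed re-check: the flagged token at index 4 is a real, unannotated error.
rechecked = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]

p_annotated, r_annotated = precision_recall(annotated, predicted)
p_rechecked, r_rechecked = precision_recall(rechecked, predicted)
print(f"against annotation: P={p_annotated:.2f} R={r_annotated:.2f}")
print(f"after re-check:     P={p_rechecked:.2f} R={r_rechecked:.2f}")
```

In this toy case the measured precision rises once the missed annotation is corrected, which is the sense in which a score computed against incomplete annotations understates real performance.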

Michael Gamon
