Paper Information - Addressing Class Imbalance in Grammatical Error Detection with Evaluation Metric Optimization

We address the problem of class imbalance in supervised grammatical error detection (GED) for non-native speaker text, which results from the low proportion of erroneous examples relative to the large number of error-free examples. Most learning algorithms maximize accuracy, which is not a suitable objective for such imbalanced data. Most GED systems address this issue by tuning hyperparameters to maximize metrics like Fβ. Instead, we show that classifiers whose model parameters are learned by directly optimizing evaluation metrics like the F1 and F2 scores perform better on these metrics than traditional sampling and cost-sensitive learning solutions to class imbalance. Optimizing these metrics is useful in recall-oriented grammatical error detection scenarios. We also show that there are inherent difficulties in optimizing precision-oriented evaluation metrics like F0.5. We establish this through a systematic evaluation on multiple datasets and different GED tasks.
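To make the abstract's point about accuracy concrete, here is a minimal sketch of the standard Fβ definition, Fβ = (1 + β²)·P·R / (β²·P + R), applied to two invented confusion-matrix settings. The counts, the `f_beta` and `accuracy` helpers, and the "baseline" and "detector" names are illustrative assumptions, not data or code from the paper, and the sketch only evaluates the metrics rather than reproducing the paper's metric-optimizing learner.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion counts.

    beta > 1 weights recall more heavily (e.g. F2);
    beta < 1 weights precision more heavily (e.g. F0.5).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(tp, fp, fn, tn):
    """Fraction of all examples that are labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical data: 1000 examples, of which only 50 contain an error.
# "baseline" labels every example as error-free and so finds no errors.
baseline = dict(tp=0, fp=0, fn=50, tn=950)
# "detector" flags 60 examples, 30 of them correctly.
detector = dict(tp=30, fp=30, fn=20, tn=920)

for name, c in [("baseline", baseline), ("detector", detector)]:
    acc = accuracy(**c)
    scores = {b: f_beta(c["tp"], c["fp"], c["fn"], beta=b) for b in (0.5, 1.0, 2.0)}
    print(f"{name}: accuracy={acc:.3f} "
          f"F0.5={scores[0.5]:.3f} F1={scores[1.0]:.3f} F2={scores[2.0]:.3f}")
```

With these invented counts both systems reach the same accuracy, although the baseline detects no errors at all; only the Fβ scores separate them, which is why accuracy is a poor training objective when error-free examples dominate.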

Pushpak Bhattacharyya | Anoop Kunchukuttan
