A Reassessment of Reference-Based Grammatical Error Correction Metrics
[1] Ondrej Bojar,et al. Results of the WMT15 Metrics Shared Task , 2015, WMT@EMNLP.
[2] Marcin Junczys-Dowmunt,et al. Human Evaluation of Grammatical Error Correction Systems , 2015, EMNLP.
[3] Matt Post,et al. Efficient Elicitation of Annotations for Human Evaluation of Machine Translation , 2014, WMT@ACL.
[4] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[5] Ted Briscoe,et al. Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction , 2017, ACL.
[6] Hwee Tou Ng,et al. How Far are We from Fully Automatic High Quality Grammatical Error Correction? , 2015, ACL.
[7] S. Lewis,et al. Regression analysis , 2007, Practical Neurology.
[8] Matt Post,et al. Ground Truth for Grammatical Error Correction Metrics , 2015, ACL.
[9] Philipp Koehn,et al. Ten Years of WMT Evaluation Campaigns: Lessons Learnt , 2016 .
[10] Ondrej Bojar,et al. Results of the WMT13 Metrics Shared Task , 2013 .
[11] Raymond Hendy Susanto,et al. The CoNLL-2014 Shared Task on Grammatical Error Correction , 2014 .
[12] Helen Yannakoudakis,et al. Compositional Sequence Labeling Models for Error Detection in Learner Writing , 2016, ACL.
[13] Nizar Habash,et al. The Second QALB Shared Task on Automatic Text Correction for Arabic , 2015, ANLP@ACL.
[14] Hwee Tou Ng,et al. The CoNLL-2013 Shared Task on Grammatical Error Correction , 2013, CoNLL Shared Task.
[15] Matt Post,et al. Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality , 2016, TACL.
[16] Adam Kilgarriff,et al. Helping Our Own: The HOO 2011 Pilot Shared Task , 2011, ENLG.
[17] Joel R. Tetreault,et al. There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction , 2016, EMNLP.
[18] Hwee Tou Ng,et al. Better Evaluation for Grammatical Error Correction , 2012, NAACL.
[19] Ted Briscoe,et al. Towards a standard evaluation method for grammatical error detection and correction , 2015, NAACL.
[20] Ondrej Bojar,et al. Results of the WMT16 Metrics Shared Task , 2016 .
[21] Matt Post,et al. GLEU Without Tuning , 2016, ArXiv.
[22] Ondrej Bojar,et al. Results of the WMT17 Metrics Shared Task , 2017, WMT.
[23] Philipp Koehn,et al. Findings of the 2013 Workshop on Statistical Machine Translation , 2013, WMT@ACL.
[24] Philipp Koehn,et al. Findings of the 2012 Workshop on Statistical Machine Translation , 2012, WMT@NAACL-HLT.
[25] Joel R. Tetreault,et al. GEC into the future: Where are we going and how do we get there? , 2017, BEA@EMNLP.
[26] Timothy Baldwin,et al. Testing for Significance of Increased Correlation with Human Judgment , 2014, EMNLP.
[27] Kentaro Inui,et al. Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems , 2017, IJCNLP.
[28] Martin Chodorow,et al. Problems in Evaluating Grammatical Error Detection Systems , 2012, COLING.
[29] Robert Dale,et al. HOO 2012: A Report on the Preposition and Determiner Error Correction Shared Task , 2012, BEA@NAACL-HLT.