Annotation and Classification of Sentence-level Revision Improvement

Studies of writing revisions rarely focus on revision quality. To address this issue, we introduce a corpus of between-draft revisions of student argumentative essays, annotated for whether each revision improves essay quality. We demonstrate a potential use of our annotations by developing a machine learning model to predict revision improvement. With the goal of expanding training data, we also extract revisions from a dataset edited by expert proofreaders. Our results indicate that blending expert and non-expert revisions increases model performance, with expert data particularly important for predicting low-quality revisions.
