Score Resolution and the Interrater Reliability of Holistic Scores in Rating Essays

The assessment of students' writing skills through essays is a common practice in educational institutions. Scoring essays requires considerable judgment on the part of the raters. When raters assign different scores to an essay, testing practitioners must resolve the discrepancy before computing an operational score to report to the examinee. This study investigated five forms of score resolution that were reported in a national survey of state department of education testing agencies, and examined the effect of each form on the reliability of the resulting operational scores. The results show that some resolution methods are associated with higher interrater reliability than others. They also show that the choice of resolution method can affect the magnitude of the reported score as well as the final passing rate of an assessment.