Generalizability of Text Quality Scores

This chapter explores the generalizability of text quality ratings for ninth-grade students in pre-university education and for first-year university students. For first-year university students, fewer texts need to be rated than for ninth-graders. First-year students prove to be very stable writers, especially when writing in a foreign language. The stability of text quality scores also appears to depend on the way texts are rated. So-called analytic scoring schemes seem to result in reliable but topic-dependent text quality scores. Holistic ratings, on the other hand, appear to give raters less support (resulting in lower inter-rater agreement) but yield less topic-dependent scores. Hence, writers need to write fewer texts for their scores to generalize if these texts are rated holistically. Implications of these results for experimental studies on writing and for the educational effectiveness of writing pedagogies are discussed.

Keywords: analytic scoring schemes; first-year students; generalizability; holistic ratings; ninth-graders; text quality scores; writing skills