Explaining ESL essay holistic scores: A multilevel modeling approach

This study adopted a multilevel modeling (MLM) approach to examine the contribution of rater and essay factors to variability in ESL essay holistic scores. Previous research aiming to explain variability in essay holistic scores has focused on either rater or essay factors. The few studies that have examined the contribution of more than one factor relied on analytic techniques that do not reflect the nested structure of essay ratings. One goal of this article is to illustrate the use and potential contributions of MLM to research on essay score variability. The study included 31 experienced and 29 novice raters, each of whom rated a set of essays holistically and analytically. Scores were analyzed using MLM to examine the associations between essay features and holistic scores, and the impact of rater experience on both the holistic scores and these associations. The experienced raters assigned lower scores and gave more importance to linguistic accuracy than did the novices. The novices gave more importance to argumentation, and their scores exhibited more variability. The article concludes by highlighting the value of MLM in identifying and estimating the contributions of various individual, textual, and contextual factors in the rating context to variability in ESL essay scores.
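
To make the nested structure concrete, a generic two-level random-coefficient model of the kind the abstract describes can be sketched as follows. This is an illustrative specification only, not the model reported in the article: the symbols ($Y_{ij}$, $X_{ij}$, $W_j$, and the $\gamma$ coefficients), the single essay-feature predictor, and the treatment of ratings as nested within raters are all assumptions made for the sketch.

```latex
% Illustrative sketch only; all symbols are assumptions, not the article's notation.
% Level 1 (ratings within rater j): holistic score Y_{ij} that rater j gives essay i,
% predicted from an analytic rating X_{ij} of one essay feature (e.g. linguistic accuracy).
\begin{align}
  Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + e_{ij},
         & e_{ij} &\sim N(0, \sigma^2) \\
% Level 2 (raters): rater experience W_j (1 = experienced, 0 = novice) predicts
% both the rater's average score (intercept) and the weight given to the feature (slope).
  \beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
  \beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j}
\end{align}
```

Under these assumptions, a negative $\gamma_{01}$ would correspond to experienced raters assigning lower holistic scores on average, and a positive $\gamma_{11}$ (with $X_{ij}$ measuring linguistic accuracy) would correspond to experienced raters weighting accuracy more heavily than novices. The random effects $u_{0j}$ and $u_{1j}$ capture rater-to-rater differences that experience does not explain.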
