Rater types in writing performance assessments: A classification approach to rater variability
[1] Eli Hinkel. Native and Nonnative Speakers' Pragmatic Interpretations of English Texts. , 1994 .
[2] Brian K. Lynch,et al. Investigating variability in tasks and rater judgements in a performance test of foreign language speaking , 1995 .
[3] Jessica Williams,et al. Exploring the Dynamics of Second Language Writing , 2004 .
[4] H. Chow. "Holistic assessment: what goes on in the raters' minds?" , 1997 .
[5] Hans-Hermann Bock,et al. Two-mode clustering methods: a structured overview , 2004, Statistical methods in medical research.
[6] Alison Green,et al. Verbal Protocol Analysis in Language Testing Research: A Handbook , 1998 .
[7] Louis Guttman,et al. A STRUCTURAL THEORY FOR INTERGROUP BELIEFS AND ACTION , 1959 .
[8] W. Castillo,et al. Recurrence Properties in Two-Mode Hierarchical Clustering , 2000 .
[9] Edward W. Wolfe,et al. The relationship between essay reading style and scoring proficiency in a psychometric scoring system , 1997 .
[10] Thomas Eckes. Beurteilerübereinstimmung und Beurteilerstrenge , 2004 .
[11] Wolfgang Gaul,et al. From Data to Knowledge: Theoretical and Practical Aspects of Classification, Data Analysis, and Knowledge Organization , 1996 .
[12] Lawrence T. DeCarlo,et al. A Model of Rater Behavior in Essay Grading Based on Signal Detection Theory , 2005 .
[13] Mary L. DeRemer,et al. Writing assessment: Raters' elaboration of the rating task , 1998 .
[14] Michael Ranney,et al. Cognitive Differences in Proficient and Nonproficient Essay Scorers , 1998 .
[15] Tom Lumley,et al. Rater characteristics and rater bias: implications for training , 1995 .
[16] Alister Cumming,et al. Decision Making While Rating ESL/EFL Writing Tasks: A Descriptive Framework. , 2002 .
[17] L. Hamp-Lyons. Exploring the Dynamics of Second Language Writing: Writing teachers as assessors of writing , 2003 .
[18] Thomas Eckes,et al. Examining Rater Effects in TestDaF Writing and Speaking Performance Assessments: A Many-Facet Rasch Analysis , 2005 .
[19] S. Freedman. How Characteristics of Student Essays Influence Teachers' Evaluations , 2005 .
[20] B. Huot,et al. Validating holistic scoring for writing assessment: theoretical and empirical foundations , 1993 .
[21] S. Cushing. Using FACETS to model rater training effects , 1998 .
[22] Carol M. Myford,et al. READER CALIBRATION AND ITS POTENTIAL ROLE IN EQUATING FOR THE TEST OF WRITTEN ENGLISH , 1995 .
[23] Thomas Eckes,et al. An Agglomerative Method for Two-Mode Hierarchical Clustering , 1991 .
[24] Edward W Wolfe,et al. Detecting and measuring rater effects using many-facet Rasch measurement: Part I. , 2003, Journal of applied measurement.
[25] Tom Lumley,et al. Research methods in language testing , 2005 .
[26] P. Orlik,et al. An error variance approach to two-mode hierarchical clustering , 1993 .
[27] Alfred Appiah Sakyi. Validation of holistic scoring for ESL writing assessment: How raters evaluate compositions , 2000 .
[28] Rob Schoonen,et al. Generalizability of writing scores: an application of structural equation modeling , 2005 .
[29] Edward W Wolfe,et al. Detecting and measuring rater effects using many-facet Rasch measurement: Part II. , 2004, Journal of applied measurement.
[30] T. McNamara. Measuring Second Language Performance , 1996 .
[31] Tom Lumley,et al. Assessing second language writing: the rater's perspective , 2005 .
[32] L. Hubert,et al. Additive two-mode clustering: The error-variance approach revisited , 1995 .
[33] R. M. Smith,et al. Fit analysis in latent trait measurement models. , 2000, Journal of applied measurement.
[34] George Engelhard,et al. Examining Rater Errors in the Assessment of Written Composition With a Many-Faceted Rasch Model , 1994 .
[35] Daniel J. Reed,et al. Revisiting raters and ratings in oral language assessment , 2001 .
[36] Sara Cushing Weigle,et al. Investigating rater/prompt interactions in writing assessment: Quantitative and qualitative approaches , 1999 .
[37] Cyril J. Weir,et al. Language Testing and Validation , 2005 .
[38] T. McNamara. Item Response Theory and the validation of an ESP test for health professionals , 1990 .
[39] P. Congdon,et al. The Stability of Rater Severity in Large‐Scale Assessment Programs , 2000 .
[40] Thomas Eckes,et al. Recent Developments in Multimode Clustering , 1996 .
[41] M. Vergeer,et al. The assessment of writing ability: expert readers versus lay readers , 1997 .
[42] George Engelhard,et al. MONITORING FACULTY CONSULTANT PERFORMANCE IN THE ADVANCED PLACEMENT ENGLISH LITERATURE AND COMPOSITION PROGRAM WITH A MANY-FACETED RASCH MODEL , 2003 .
[43] Lyle F. Bachman,et al. Language testing in practice: designing and developing useful language tests , 1996 .
[44] F. Lievens,et al. Assessor training strategies and their effects on accuracy, interrater reliability, and discriminant validity. , 2001, The Journal of applied psychology.
[45] Sara Cushing Weigle,et al. Effects of training on raters of ESL compositions , 1994 .
[46] Alister Cumming,et al. Expertise in evaluating second language compositions , 1990 .
[47] William T. Hoyt,et al. Magnitude and moderators of bias in observer ratings: A meta-analysis. , 1999 .
[48] S. Barrett. The impact of training on rater variability , 2001 .
[49] Sara Cushing Weigle,et al. Using FACETS to model rater training effects , 1998 .
[50] Individual Feedback to Enhance Rater Training: Does It Work? , 2005 .
[51] John M Linacre,et al. Optimizing rating scale category effectiveness. , 2002, Journal of applied measurement.
[52] Elaine D. Pulakos,et al. The development of training programs to increase accuracy with different rating tasks , 1986 .
[53] James Dean Brown. Do English and ESL Faculties Rate Writing Samples Differently? , 1991 .
[54] T. Lumley. Assessment criteria in a large-scale writing test: what do they really mean to the raters? , 2002 .
[55] Mohammad Ali,et al. Performance Assessment in Language Testing. , 2008 .
[56] C. Weir. Language Testing and Validation: An Evidence-Based Approach , 2004 .