Paper information - Measurement with judges: Many-faceted conjoint measurement

Measurement with judges: Many-faceted conjoint measurement

Abstract: One of the major problems in assessment and evaluation is that different people rate the same performance with varying degrees of severity. Individual raters vary the severity of their ratings in a manner dependent upon a wide array of factors. Most efforts intended to secure reliable and valid ratings across judges assume that the goal is to obtain identical ratings from different judges for the same performance. In contrast to these approaches, probabilistic conjoint measurement facilitates observation and calibration of differences in judge severity, making it possible to account for these differences in the interpretation of the assigned ratings. This chapter addresses issues in the application of Facets analyses to writing assessment, aesthetic judgment, and the evaluation of public speaking ability.
