Measurement with judges: Many-faceted conjoint measurement

Abstract

One of the major problems in assessment and evaluation is that different people rate the same performance with varying degrees of severity. The severity of an individual rater's judgments depends on a wide array of factors. Most efforts to secure reliable and valid ratings across judges assume that the goal is to obtain identical ratings from different judges for the same performance. In contrast to these approaches, probabilistic conjoint measurement facilitates the observation and calibration of differences in judge severity, making it possible to account for these differences when interpreting the assigned ratings. This chapter addresses issues in the application of Facets analyses to writing assessment, aesthetic judgment, and the evaluation of public speaking ability.
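
To illustrate what calibrating judge severity can mean in practice, the following is a minimal sketch of the dichotomous form of a many-facet Rasch model, in which the log-odds of a positive rating equal the performer's ability minus the task's difficulty minus the judge's severity. The function name and all numeric values are hypothetical and chosen only for illustration; they are not taken from the chapter, and analyses of real rating scales would typically add a rating-scale threshold term.

```python
import math


def success_probability(ability, difficulty, severity):
    """Probability of a positive rating under a dichotomous
    many-facet Rasch model. All parameters are in logits."""
    logit = ability - difficulty - severity
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical values: the same performance on the same task,
# rated by a lenient judge and by a severe judge.
lenient = success_probability(ability=1.0, difficulty=0.0, severity=-0.5)
severe = success_probability(ability=1.0, difficulty=0.0, severity=0.5)

print(f"lenient judge: {lenient:.3f}")
print(f"severe judge:  {severe:.3f}")
```

Because severity enters the model as its own parameter, a difference between the two judges need not be treated as error to be eliminated; it can be estimated and removed from the measure of the performer.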