More Consensus Than Idiosyncrasy: Categorizing Social Judgments to Examine Variability in Mini-CEX Ratings

Purpose: Social judgment research suggests that rater unreliability in performance assessments arises from raters’ differing inferences about the performer and the underlying reasons for the observed performance. These varying social judgments are not entirely idiosyncratic; rather, they tend to partition into a finite number of distinct subgroups, suggesting some “signal” in the “noise” of interrater variability. The authors investigated the proportion of variance in Mini-CEX ratings attributable to such partitions of raters’ social judgments about residents.

Method: In 2012 and 2013, physicians reviewed video-recorded patient encounters for seven residents, completed a Mini-CEX for each, and described their social judgments of the residents. Additional participants sorted these descriptions, which were analyzed using latent partition analysis (LPA). The best-fitting set of partitions for each resident served as the independent variable in a one-way ANOVA to determine the proportion of variance explained in Mini-CEX ratings.

Results: Forty-eight physicians rated at least one resident (34 assessed all seven). The seven sets of social judgments were sorted by 14 participants. Across residents, 2 to 5 partitions (mode: 4) provided a good LPA fit, indicating that raters within a subgroup made similar social judgments while causal explanations for each resident’s performance differed across subgroups. The partitions accounted for 9% to 57% of the variance in Mini-CEX ratings across residents (mean = 32%).

Conclusions: These findings suggest that multiple “signals” do exist within the “noise” of interrater variability in performance-based assessment. It may be more valuable to understand and exploit these multiple signals than to try to eliminate them.
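The variance-explained figures reported above correspond to eta-squared from a one-way ANOVA in which each rater’s partition membership is the factor and that rater’s Mini-CEX rating is the outcome: eta-squared = SS_between / SS_total. As a minimal illustration only, the R sketch below computes this quantity for a single hypothetical resident; the simulated data, column names, and three-partition structure are invented for the example and are not taken from the study.

    # Minimal sketch (R): proportion of Mini-CEX rating variance explained
    # by LPA-derived rater partitions for one resident. Data are simulated;
    # the three subgroups "A"/"B"/"C" stand in for the best-fitting partitions.
    set.seed(1)
    ratings <- data.frame(
      rating    = c(rnorm(10, mean = 6), rnorm(10, mean = 4), rnorm(10, mean = 5)),
      partition = factor(rep(c("A", "B", "C"), each = 10))  # subgroup membership from LPA
    )

    fit <- aov(rating ~ partition, data = ratings)  # one-way ANOVA
    ss  <- summary(fit)[[1]][["Sum Sq"]]            # between- and within-group sums of squares
    eta_squared <- ss[1] / sum(ss)                  # eta-squared = SS_between / SS_total
    eta_squared  # e.g., 0.32 would mean partitions explain 32% of rating variance

Repeating this calculation per resident, with that resident’s own best-fitting set of partitions as the factor, would yield the range of values (9% to 57%) reported in the Results.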
