Are Examiners’ Judgments in OSCE-Style Assessments Influenced by Contrast Effects?

Purpose: Laboratory studies have shown that performance assessment judgments can be biased by "contrast effects": assessors' scores become more positive, for example, when the assessed performance is preceded by relatively weak candidates. The authors investigated whether this effect occurs in real, high-stakes performance assessments despite their greater formality and use of behavioral descriptors.

Method: Data were obtained from the 2011 United Kingdom Foundation Programme clinical assessment and the 2008 University of Alberta Multiple Mini-Interview. Candidate scores were compared with the scores of immediately preceding candidates and of progressively more distant candidates. In addition, the average score of the three preceding candidates was calculated. Relationships between these variables were examined using linear regression.

Results: Negative relationships were observed between index scores and both immediately preceding and recent scores for all exam formats. Relationships were stronger between index scores and the average of the three preceding scores. These effects persisted even after examiners had judged several performances, explaining up to 11% of observed variance on some occasions.

Conclusions: These findings suggest that contrast effects do influence examiner judgments in high-stakes performance-based assessments. Although the effect was smaller than that observed in experimentally controlled laboratory studies, this is to be expected: in real-world data, differences between successive candidates are less distinct, which weakens the "intervention." Although the circuit format of these exams may reduce examiners' susceptibility to such influences, the persistence of the effect after examiners had judged several candidates suggests that its potential influence on candidate scores should not be ignored.
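The analysis described in the Method section can be illustrated with a minimal sketch: for each examiner's sequence of candidates, compute the immediately preceding score and the mean of the three preceding scores, then regress the index score on each predictor. This is not the authors' analysis code; the column names (examiner, order, score) and the toy data are assumptions for illustration only.

```python
# Illustrative sketch (assumed data layout, not the authors' code):
# one row per candidate encounter, ordered within each examiner's circuit.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "examiner": ["A"] * 6 + ["B"] * 6,
    "order":    list(range(1, 7)) * 2,
    "score":    [3.2, 4.1, 3.8, 2.9, 4.5, 3.6,
                 2.8, 3.9, 4.2, 3.1, 3.7, 4.0],
})

df = df.sort_values(["examiner", "order"])
scores_by_examiner = df.groupby("examiner")["score"]

# Score awarded to the immediately preceding candidate seen by the same examiner.
df["prev_score"] = scores_by_examiner.shift(1)

# Mean of the three preceding candidates' scores (defined only once three exist).
df["prev3_mean"] = scores_by_examiner.transform(
    lambda s: s.shift(1).rolling(window=3).mean()
)

# Linear regressions analogous to those described in the Method section.
model_prev = smf.ols("score ~ prev_score",
                     data=df.dropna(subset=["prev_score"])).fit()
model_prev3 = smf.ols("score ~ prev3_mean",
                      data=df.dropna(subset=["prev3_mean"])).fit()

# A negative coefficient on prev_score / prev3_mean would be consistent
# with a contrast effect; R-squared gives the variance explained.
print(model_prev.params, model_prev.rsquared)
print(model_prev3.params, model_prev3.rsquared)
```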
