Assuring validity of multisource feedback in a national programme

Objective To report the evidence for, and challenges to, the validity of the Sheffield Peer Review Assessment Tool (SPRAT) with paediatric Specialist Registrars (SpRs) across the UK, as part of the Royal College of Paediatrics and Child Health workplace-based assessment programme.

Design Quality assurance analysis, including generalisability, of a multisource feedback questionnaire study.

Setting All UK Deaneries between August 2005 and May 2006.

Participants 577 year 2 and year 4 paediatric SpRs.

Interventions Trainees were evaluated using SPRAT questionnaires sent to clinical colleagues of their choosing. Data were summarised as totals, means and SDs, and year groups were compared using independent t tests. A factor analysis was undertaken. Reliability was estimated using generalisability theory. Trainee and assessor demographic details were explored to try to explain variability in scores.

Main outcome measures 4770 SPRAT assessments were provided about 577 paediatric SpRs. Mean scores differed significantly between years (Year 2 mean=5.08, SD=0.34; Year 4 mean=5.18, SD=0.34). A factor analysis returned a two-factor solution: clinical care and psychosocial skills. The 95% CI showed that trainees scoring ≥4.3 with nine assessors can be seen as achieving satisfactory performance with statistical confidence. Consultants marked trainees significantly lower (t=−4.52), whereas Senior House Officers (SHOs) and Foundation doctors scored their SpRs significantly higher (SHO t=2.06, Foundation t=2.77).

Conclusions There is increasing evidence that multisource feedback (MSF) assesses two generic traits, clinical care and psychosocial skills. The validity of MSF is threatened by systematic bias, namely leniency bias and the seniority of assessors. Unregulated self-selection of assessors needs to end.
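The link between the number of assessors and the score needed for a confident "satisfactory" judgement can be illustrated with a short calculation. The sketch below is not the authors' analysis. It assumes a cut score of 4.0 for satisfactory performance and a hypothetical residual (error) variance of 0.21 for a single assessor's rating; neither value is given in the abstract, and the variance was chosen only so that the example lands near the reported figure of 4.3 with nine assessors. The function names are also illustrative.

```python
import math

# ASSUMPTION: score taken as the "satisfactory" cut point (not stated in the abstract).
CUT_SCORE = 4.0
# ASSUMPTION: hypothetical error variance of one assessor's rating, picked so the
# sketch reproduces roughly 4.3 with nine assessors. Not a value from the study.
RESIDUAL_VARIANCE = 0.21


def sem_of_mean(residual_variance: float, n_assessors: int) -> float:
    """Standard error of a trainee's mean score averaged over n assessors."""
    return math.sqrt(residual_variance / n_assessors)


def confident_pass_mark(cut_score: float, residual_variance: float,
                        n_assessors: int, z: float = 1.96) -> float:
    """Lowest mean score whose 95% CI lower bound still reaches the cut score."""
    return cut_score + z * sem_of_mean(residual_variance, n_assessors)


if __name__ == "__main__":
    # More assessors shrink the standard error, so the required mean falls.
    for n in (4, 9, 16):
        mark = confident_pass_mark(CUT_SCORE, RESIDUAL_VARIANCE, n)
        print(f"{n:2d} assessors: mean score of at least {mark:.2f}")
```

Under these assumptions, the required mean falls as assessors are added, because the confidence interval around a trainee's mean narrows with the square root of the number of ratings.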

[1]  K. Kraiger,et al.  Study of race effects in objective indices and subjective evaluations of performance: A meta-analysis of performance criteria. , 1986 .

[2]  J. Archer,et al.  Assessment of doctors' consultation skills in the paediatric setting: the Paediatric Consultation Assessment Tool , 2008, Archives of Disease in Childhood.

[3]  C. Eiser,et al.  Children and their parents assessing the doctor–patient interaction: a rating system for doctors' communication skills , 2005, Medical education.

[4]  J. Conway,et al.  Analysis and Design of Multitrait-Multirater Performance Appraisal Studies , 1996 .

[5]  E. Thorndike A constant error in psychological ratings. , 1920 .

[6]  Heejoon Park,et al.  The relationship between rater affect and three sources of 360-degree feedback ratings , 2001 .

[7]  D. Ilgen,et al.  The Effects of Ratee Prototypicality on Rater Observation and Accuracy1 , 1989 .

[8]  William C McGaghie,et al.  SPECIAL ARTICLE: Cognitive, Social and Environmental Sources of Bias in Clinical Performance Ratings , 2003, Teaching and learning in medicine.

[9]  G. Elwyn,et al.  Learning in practice Review of instruments for peer assessment of physicians , 2004 .

[10]  L W T Schuwirth,et al.  Selecting performance assessment methods for experienced physicians , 2002, Medical education.

[11]  J. Conway,et al.  Distinguishing contextual performance from task performance for managerial jobs. , 1999 .

[12]  Angelo S. DeNisi,et al.  A closer look at interpersonal affect as a distinct influence on cognitive processing in performance evaluations. , 1994 .

[13]  J. Carline,et al.  Use of peer ratings to evaluate physician performance. , 1993, JAMA.

[14]  Jack M. Feldman,et al.  Beyond Attribution Theory: Cognitive Processes in Performance Appraisal , 1981 .

[15]  Robert L. Holzbach,et al.  Rater bias in performance ratings: Superior, self-, and peer ratings. , 1978 .

[16]  London,et al.  General Medical Council , 1920 .

[17]  Walter C. Borman,et al.  Effects of instructions to avoid halo error on reliability and validity of performance evaluation ratings. , 1975 .

[18]  Walter C. Borman,et al.  The rating of individuals in organizations: An alternate approach , 1974 .

[19]  Jeff W. Johnson,et al.  The relative importance of task and contextual performance dimensions to supervisor judgments of overall performance. , 2001, The Journal of applied psychology.

[20]  Bruce Thompson,et al.  Score Reliability in Webor Internet-Based Surveys: Unnumbered Graphic Rating Scales versus Likert-Type Scales , 2001 .

[21]  A. H. Church,et al.  A five-phase framework for designing a successful multisource feedback system. , 2001 .

[22]  Angelo S. DeNisi,et al.  A cognitive view of the performance appraisal process: A model and research propositions , 1984 .

[23]  Peter Villanova,et al.  Stability of Rater Leniency: Three Studies , 1995 .

[24]  Leanne E. Atwater,et al.  Personal Attributes as Predictors of Superiors' and Subordinates' Perceptions of Military Academy Leadership , 1993 .

[25]  J. Archer,et al.  Use of SPRAT for peer review of paediatricians in training , 2005, BMJ : British Medical Journal.

[26]  J. Norcini,et al.  Peer assessment of competence , 2003, Medical education.

[27]  Shunzo Koizumi,et al.  Medical professionalism in the new millennium: a physician charter. , 2002, Obstetrics and gynecology.

[28]  Alberto Malliani,et al.  Medical professionalism in the new millennium: a physician charter. , 2002, Annals of internal medicine.

[29]  H. J. Bernardin,et al.  Conscientiousness and agreeableness as predictors of rating leniency. , 2000, The Journal of applied psychology.

[30]  R. Saavedra,et al.  Peer evaluation in self-managing work groups. , 1993 .

[31]  Walter C. Borman,et al.  Personal constructs, performance schemata, and “folk theories” of subordinate effectiveness: Explorations in an army officer sample☆ , 1987 .

[32]  H. Kimball,et al.  Medical Professionalism in the New Millennium: A Physician Charter 15 Months Later , 2003, Annals of Internal Medicine.

[33]  C. Carver,et al.  Attention and Self-Regulation: A Control-Theory Approach to Human Behavior , 1981 .

[34]  Questionnaire construction and question writing for research in medical education. , 1988, Medical education.

[35]  Kurt Kraiger,et al.  A meta-analysis of ratee race effects in performance ratings. , 1985 .

[36]  Julian Archer,et al.  mini-PAT (Peer Assessment Tool): A Valid Component of a National Assessment Programme in the UK? , 2008, Advances in health sciences education : theory and practice.