Consistency, inter-rater reliability, and validity of 441 consecutive mock oral examinations in anesthesiology: implications for use as a tool for assessment of residents.

BACKGROUND: Oral practice examinations (OPEs) are used extensively in many anesthesiology programs for various reasons, including assessment of clinical judgment. Yet oral examinations have been criticized for their subjectivity. The authors studied the reliability, consistency, and validity of their OPE program to determine if it was a useful assessment tool.

METHODS: From 1989 through 1993, we prospectively studied 441 OPEs given to 190 residents. The examination format closely approximated that used by the American Board of Anesthesiology. Pass-fail grade and an overall numerical score were the OPE results of interest. Internal consistency and inter-rater reliability were determined using agreement measures. To assess their validity in describing competence, OPE results were correlated with in-training examination results and faculty evaluations. Furthermore, we analyzed the relationship of OPE results with implicit indicators of resident preparation such as length of training.

RESULTS: The internal consistency coefficient for the overall numerical score was 0.82, indicating good correlation among component scores. The interexaminer agreement was 0.68, indicating moderate or good agreement beyond that expected by chance. The actual agreement among examiners on pass-fail was 84%. Correlation of overall numerical score with in-training examination scores and faculty evaluations was moderate (r = 0.47 and 0.41, respectively; P < 0.01). OPE results were significantly (P < 0.01) associated with training duration, previous OPE experience, trainee preparedness, and trainee anxiety.

CONCLUSION: Our results show the substantial internal consistency and reliability of OPE results at a single institution. The positive correlation of OPE scores with in-training examination scores, faculty evaluations, and other indicators of preparation suggests that OPEs are a reasonably valid tool for assessment of resident performance.
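The abstract reports three kinds of agreement statistics without naming the exact estimators: raw pass-fail agreement, agreement beyond chance, and an internal consistency coefficient. The sketch below shows how such quantities are commonly computed, assuming Cohen's kappa for chance-corrected agreement and Cronbach's alpha for internal consistency. Both choices are assumptions for illustration, not the authors' stated method, and all function names and grades are hypothetical.

```python
from statistics import pvariance


def percent_agreement(a, b):
    """Fraction of examinees to whom two examiners gave the same grade."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    labels = set(a) | set(b)
    # Agreement expected by chance if the two examiners graded independently.
    p_e = sum((a.count(label) / n) * (b.count(label) / n) for label in labels)
    return (p_o - p_e) / (1 - p_e)


def cronbach_alpha(items):
    """Internal consistency of component scores.

    items: one list per score component, aligned by examinee.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(pvariance(component) for component in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


# Hypothetical pass-fail grades from two examiners (not data from the study).
examiner_a = ["pass", "pass", "fail", "pass", "fail",
              "pass", "pass", "fail", "pass", "pass"]
examiner_b = ["pass", "pass", "fail", "pass", "pass",
              "pass", "pass", "fail", "fail", "pass"]

agreement = percent_agreement(examiner_a, examiner_b)
kappa = cohen_kappa(examiner_a, examiner_b)
print(f"raw agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

With these made-up grades the examiners agree on 8 of 10 candidates, yet kappa is only about 0.52, because most of that agreement would be expected by chance when both examiners pass 70% of candidates. The same gap separates the 84% actual agreement from the 0.68 chance-corrected figure in the abstract.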
