A Short Note on Optimizing Cost-Generalizability via a Machine-Learning Approach

The costs of an objective structured clinical examination (OSCE) are of concern to health profession educators globally. As OSCEs are usually designed under a generalizability theory (G-theory) framework, this article proposes a machine-learning-based approach to minimize those costs while maintaining the minimum required generalizability coefficient, a reliability-like index in G-theory. The authors took G-theory parameters estimated from an OSCE hosted by a medical school, reproduced the generalizability coefficients as a basis for the optimization, applied a simulated annealing algorithm to find the number of levels of each facet that minimizes the associated costs, and repeated the analysis under various conditions via computer simulation. For a given required generalizability coefficient, the proposed approach, which is essentially a decision-support tool, found the solution for the OSCE that minimized the associated costs. The simulation results showed how the cost reductions varied with the required level of the generalizability coefficient. Machine-learning-based approaches can be used in conjunction with psychometric modeling to help plan assessment tasks more scientifically. The proposed approach is easy to adopt in practice and to customize for specific testing designs. While these results are encouraging, possible pitfalls, such as failure of the algorithm to converge and inadequate cost assumptions, should be avoided.
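To make the idea concrete, the following is a minimal sketch of cost minimization under a generalizability constraint, not the authors' implementation. It assumes a fully crossed person x station x rater design, the standard relative generalizability coefficient for that design, and a simple linear cost model. All variance components, unit costs, bounds, and annealing settings below are hypothetical values chosen for illustration only.

```python
import math
import random

# Hypothetical variance components for a crossed person x station x rater design.
VAR = {"p": 0.5, "ps": 0.8, "pr": 0.1, "psr": 0.6}
COST_PER_STATION = 300.0   # assumed fixed cost of running one station
COST_PER_RATING = 50.0     # assumed cost of one rater at one station
MAX_STATIONS, MAX_RATERS = 30, 5  # assumed upper bounds on the facet levels


def g_coefficient(n_s, n_r, var=VAR):
    """Relative generalizability coefficient for n_s stations and n_r raters."""
    rel_error = var["ps"] / n_s + var["pr"] / n_r + var["psr"] / (n_s * n_r)
    return var["p"] / (var["p"] + rel_error)


def cost(n_s, n_r):
    """Assumed cost model: each station has a fixed cost plus a cost per rater."""
    return COST_PER_STATION * n_s + COST_PER_RATING * n_s * n_r


def anneal(target, iters=5000, t0=1000.0, cooling=0.999, seed=0):
    """Simulated annealing over integer (n_s, n_r); returns the best feasible design."""
    rng = random.Random(seed)
    state = (MAX_STATIONS, MAX_RATERS)  # start from the largest design
    if g_coefficient(*state) < target:
        raise ValueError("target is unreachable within the assumed bounds")
    best = state
    temp = t0

    def energy(s):
        # Cost plus a large penalty for falling short of the required coefficient.
        shortfall = max(0.0, target - g_coefficient(*s))
        return cost(*s) + 1e6 * shortfall

    for _ in range(iters):
        # Propose a neighbouring design by moving each facet by -1, 0, or +1.
        n_s = min(MAX_STATIONS, max(1, state[0] + rng.choice((-1, 0, 1))))
        n_r = min(MAX_RATERS, max(1, state[1] + rng.choice((-1, 0, 1))))
        cand = (n_s, n_r)
        delta = energy(cand) - energy(state)
        # Always accept improvements; accept worse moves with Metropolis probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            state = cand
            if g_coefficient(*state) >= target and cost(*state) < cost(*best):
                best = state
        temp *= cooling  # geometric cooling schedule
    return best


if __name__ == "__main__":
    n_s, n_r = anneal(target=0.8)
    print(n_s, n_r, cost(n_s, n_r), round(g_coefficient(n_s, n_r), 3))
```

Because only feasible designs are ever stored as the best solution, the returned design always meets the required coefficient, although simulated annealing does not guarantee that it is the global cost minimum. For a search space this small, an exhaustive search over all (n_s, n_r) pairs is a cheap way to check the result.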
