Recursive Partitioning vs Computerized Adaptive Testing to Reduce the Burden of Health Assessments in Cleft Lip and/or Palate: Comparative Simulation Study

Background: Computerized adaptive testing (CAT) has been shown to deliver short, accurate, and personalized versions of the CLEFT-Q patient-reported outcome measure for children and young adults born with a cleft lip and/or palate. Decision trees may integrate clinician-reported data (eg, age, gender, cleft type, and planned treatments) to make these assessments even shorter and more accurate.

Objective: We aimed to create decision tree models incorporating clinician-reported data into adaptive CLEFT-Q assessments and compare their accuracy to traditional CAT models.

Methods: We used relevant clinician-reported data and patient-reported item responses from the CLEFT-Q field test to train and test decision tree models using recursive partitioning. We compared the prediction accuracy of decision trees to CAT assessments of similar length. Participant scores from the full-length questionnaire were used as ground truth. Accuracy was assessed through Pearson’s correlation coefficient of predicted and ground truth scores, mean absolute error, root mean squared error, and a two-tailed Wilcoxon signed-rank test comparing squared error.

Results: Decision trees demonstrated poorer accuracy than CAT comparators and generally made data splits based on item responses rather than clinician-reported data.

Conclusions: When predicting CLEFT-Q scores, individual item responses are generally more informative than clinician-reported data. Decision trees that make binary splits are at risk of underfitting polytomous patient-reported outcome measure data and demonstrated poorer performance than CATs in this study.
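The accuracy measures named in the Methods (Pearson's correlation coefficient, mean absolute error, root mean squared error, and a two-tailed Wilcoxon signed-rank test on squared errors) can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the example scores, and the use of a normal approximation without continuity correction for the Wilcoxon test are all assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mae(pred, truth):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def rmse(pred, truth):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def wilcoxon_signed_rank(a, b):
    """Two-tailed Wilcoxon signed-rank test for paired samples.

    Assumption: uses the large-sample normal approximation with no
    continuity or tie correction, so small-sample p-values are rough.
    Returns (W+, p), where W+ is the sum of ranks of positive differences.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p from the standard normal
    return w_plus, p

# Hypothetical scores, invented for illustration only (not study data).
truth = [42.0, 55.0, 61.0, 70.0, 38.0, 49.0]      # full-length questionnaire
tree_pred = [47.0, 50.0, 68.0, 62.0, 45.0, 40.0]  # decision tree predictions
cat_pred = [43.0, 57.0, 60.0, 67.0, 40.0, 50.0]   # CAT predictions

for name, pred in (("tree", tree_pred), ("CAT", cat_pred)):
    print(name, "r =", round(pearson_r(pred, truth), 3),
          "MAE =", round(mae(pred, truth), 3),
          "RMSE =", round(rmse(pred, truth), 3))

# Paired comparison of per-participant squared errors between the two models.
tree_sq = [(p - t) ** 2 for p, t in zip(tree_pred, truth)]
cat_sq = [(p - t) ** 2 for p, t in zip(cat_pred, truth)]
print("Wilcoxon (W+, p):", wilcoxon_signed_rank(tree_sq, cat_sq))
```

Comparing squared errors participant by participant, rather than comparing the summary statistics alone, is what lets a paired test indicate whether one model is consistently less accurate than the other.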
