Integrating landmark modeling framework and machine learning algorithms for dynamic prediction of tuberculosis treatment outcomes

OBJECTIVE This study aims to establish an informative dynamic prediction model of treatment outcomes using follow-up records of tuberculosis (TB) patients, so that cases in which the current treatment plan may not be effective can be detected promptly.

MATERIALS AND METHODS We used 122 267 follow-up records from 17 958 new cases of pulmonary TB in the Republic of Moldova. A dynamic prediction framework integrating landmark modeling and machine learning algorithms was designed to predict patient outcomes during the course of treatment. Sensitivity and positive predictive value (PPV) were calculated to evaluate model performance at critical time points. New measures were defined to determine when follow-up laboratory tests should be conducted to obtain the most informative results.

RESULTS The random-forest algorithm performed better than the support vector machine and penalized multinomial logistic regression models for predicting TB treatment outcomes. For all 3 outcome classes (ie, cured, not cured, and died after 24 months following treatment initiation), the sensitivity and PPV of the prediction models improved as more follow-up information was collected. Specifically, for the not cured class, sensitivity increased from 0.55 to 0.84 and PPV increased from 0.32 to 0.88.

CONCLUSION The dynamic prediction framework uses longitudinal laboratory test results to predict patient outcomes at various landmarks. Sputum culture and smear results are among the important variables for prediction; however, the most recent sputum result is not always the most informative one. This framework can potentially facilitate a more effective treatment monitoring program and provide insights for policymakers toward improved guidelines on follow-up tests.
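As an illustration only, the two ideas the evaluation rests on can be sketched in a few lines of Python: restricting each patient's follow-up history to what is known at a landmark time, and computing one-vs-rest sensitivity and PPV for a single outcome class. The record layout `(patient_id, month, result)`, the function names, and the toy data are assumptions made for this sketch and are not taken from the study; the classifier itself is omitted.

```python
from collections import defaultdict


def history_up_to_landmark(records, landmark_month):
    """Group follow-up results per patient, keeping only visits at or before the landmark.

    `records` is an iterable of (patient_id, month, result) tuples. This layout is
    a hypothetical one chosen for the sketch, not the study's data format.
    """
    history = defaultdict(list)
    for patient_id, month, result in records:
        if month <= landmark_month:
            # Results observed after the landmark must not leak into the features.
            history[patient_id].append((month, result))
    # The whole history is kept (not only the latest visit), sorted by month.
    return {pid: sorted(visits) for pid, visits in history.items()}


def sensitivity_and_ppv(y_true, y_pred, label):
    """One-vs-rest sensitivity and positive predictive value for one outcome class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


# Toy data, invented for illustration only.
toy_records = [
    ("a", 0, "pos"), ("a", 2, "neg"), ("a", 5, "neg"),
    ("b", 1, "pos"), ("b", 3, "pos"),
]
features_at_month_2 = history_up_to_landmark(toy_records, landmark_month=2)

y_true = ["cured", "not cured", "not cured", "died"]
y_pred = ["cured", "not cured", "cured", "not cured"]
sens, ppv = sensitivity_and_ppv(y_true, y_pred, label="not cured")
```

In a landmark analysis, a separate model would typically be fitted on the features available at each landmark, and these metrics recomputed per landmark and per outcome class. Keeping the full history rather than only the latest visit fits the observation that the most recent sputum result is not always the most informative one.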