Machine-learning-based predictions of direct-acting antiviral therapy duration for patients with hepatitis C

INTRODUCTION Hepatitis C, which affects 71 million persons worldwide, is the most common blood-borne infection in the United States. Chronic infections can be treated effectively thanks to the availability of modern direct-acting antiviral (DAA) therapies. Real-world data on the duration of DAA therapy can help guide and optimize the course of treatment, and may also help estimate the total supply of medication required and the long-term improvements in quality of life. We developed a machine learning model to identify patient characteristics associated with prolonged DAA treatment duration.

METHODS Claims data from a nationwide U.S. commercial managed care plan covering about 60 million beneficiaries from 2009 to 2019 were used in this retrospective study. We examined differences in age, gender, and multiple comorbidities among patients treated with different durations of DAA treatment. We also evaluated machine learning models for predicting a prolonged course of DAA treatment, using the area under the receiver operating characteristic curve (AUC) as the performance measure.

RESULTS We identified 3943 patients with hepatitis C who received sofosbuvir/ledipasvir as the first course of DAA and were eligible for the study. Patients receiving prolonged treatment (n = 240, 6.1%) were more likely to have compensated cirrhosis, decompensated cirrhosis, and other comorbidities (P < 0.001). For distinguishing patients who received prolonged DAA treatment for hepatitis C from patients who received standard treatment, the optimal predictive model, constructed with XGBoost, had an AUC of 0.745 ± 0.031 (P < 0.001).

CONCLUSIONS The risk of antiviral resistance and the cost of DAA are strong motivators to ensure that first-round DAA therapy is effective. For the dominant DAA treatment during the course of this analysis, we present a model that identifies factors already captured in established guidelines and adds to those age, comorbidity burden, and type 2 diabetes status as patient characteristics predictive of extended treatment.
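
As a reading aid for the AUC figure reported above, the following minimal Python sketch shows one way the metric can be computed from predicted scores: as the fraction of (prolonged, standard) patient pairs in which the prolonged-treatment patient receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and purely illustrative. They are not the study's data, and this is not the authors' evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = prolonged treatment, 0 = standard treatment.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
# Hypothetical model scores (higher = more likely prolonged).
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.05, 0.50]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.745 indicates moderate discrimination between the two treatment groups.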