An ensemble learning based framework to estimate warfarin maintenance dose with cross-over variables exploration on incomplete data set

MOTIVATION Warfarin is a widely used oral anticoagulant, but selecting the optimal maintenance dose is challenging because of its narrow therapeutic window and the complex relationships among individual patient factors. In recent years, machine learning techniques have been widely applied to warfarin dose prediction. However, model performance tends to hit a ceiling because variable interactions are not explored sufficiently. More importantly, there is no efficient way to handle missing values when predicting the optimal warfarin maintenance dose.

METHODS Using an observational cohort from the Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, we propose a novel method for warfarin maintenance dose prediction that can assess variable interactions and handle missing values naturally. Specifically, we first examine single variables by univariate analysis and retain only the statistically significant ones. We then apply a novel feature engineering method to the retained variables to generate cross-over variables automatically. The impact of each cross-over variable is evaluated by stepwise regression, and only the significant ones are selected. Lastly, we use an ensemble learning approach, LightGBM, to learn directly from incomplete data on the selected single and cross-over variables for dose prediction.

RESULTS A total of 377 unique patients with 1173 eligible, time-independent warfarin order events are included in this study. In 5-fold cross-validation, comprehensive experiments show that the proposed method is effective at exploring variable interactions and modeling incomplete data. The R² reaches 75.0% on average. Moreover, subgroup analysis reveals that our method performs much better than the baseline methods, especially in the medium-dose and high-dose subgroups. Lastly, the IWPC dosing prediction model is used for further comparison, and our approach outperforms it by a significant margin.

CONCLUSION In summary, our proposed method can explore variable interactions and learn directly from incomplete data for warfarin maintenance dose prediction. It shows great promise and is worthy of further research.
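To make the idea of cross-over variables on incomplete data concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how cross-over variables are constructed, so this example assumes they are pairwise products of the retained single variables. The function name `add_crossover_variables`, the variable names (`age`, `weight_kg`, `inr`), and the toy values are all hypothetical. A missing input is propagated as NaN rather than imputed, so that a learner able to handle missing values natively (LightGBM treats NaN as missing by default) could consume the result directly.

```python
import math
from itertools import combinations


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def add_crossover_variables(rows, variables):
    """Return copies of `rows` with one product term per pair of variables.

    Assumption: a cross-over variable is the product of two single
    variables. If either input is missing, the product is NaN, so the
    missingness is preserved instead of being imputed.
    """
    result = []
    for row in rows:
        new_row = dict(row)
        for a, b in combinations(variables, 2):
            va, vb = row.get(a), row.get(b)
            if is_missing(va) or is_missing(vb):
                new_row[f"{a}_x_{b}"] = math.nan
            else:
                new_row[f"{a}_x_{b}"] = va * vb
        result.append(new_row)
    return result


# Toy records with made-up values; the second one has a missing weight.
rows = [
    {"age": 60.0, "weight_kg": 70.0, "inr": 2.5},
    {"age": 45.0, "weight_kg": None, "inr": 2.0},
]
crossed = add_crossover_variables(rows, ["age", "weight_kg", "inr"])
```

In the pipeline described above, candidate terms such as these would then be screened by stepwise regression before model fitting; that selection step is omitted from this sketch.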
