Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project

OBJECTIVE Claims-based algorithms are used in the Food and Drug Administration Sentinel Active Risk Identification and Analysis System to identify occurrences of health outcomes of interest (HOIs) for medical product safety assessment. This project aimed to apply machine learning classification techniques to demonstrate the feasibility of developing a claims-based algorithm to predict an HOI in structured electronic health record (EHR) data.

MATERIALS AND METHODS We used the 2015-2019 IBM MarketScan Explorys Claims-EMR Data Set, linking administrative claims and EHR data at the patient level. We focused on a single HOI, rhabdomyolysis, defined by EHR laboratory test results. Using claims-based predictors, we applied machine learning techniques to predict the HOI: logistic regression, LASSO (least absolute shrinkage and selection operator), random forests, support vector machines, artificial neural nets, and an ensemble method (Super Learner).

RESULTS The study cohort included 32 956 patients and 39 499 encounters. Model performance (positive predictive value [PPV], sensitivity, specificity, area under the receiver-operating characteristic curve) varied considerably across techniques. The area under the receiver-operating characteristic curve exceeded 0.80 in most model variations.

DISCUSSION For the main Food and Drug Administration use case of assessing risk of rhabdomyolysis after drug use, a model with a high PPV is typically preferred. The Super Learner ensemble model without adjustment for class imbalance achieved a PPV of 75.6%, substantially better than a previously used human expert-developed model (PPV = 44.0%).

CONCLUSIONS It is feasible to use machine learning methods to predict an EHR-derived HOI with claims-based predictors. Modeling strategies can be adapted for intended uses, including surveillance, identification of cases for chart review, and outcomes research.
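The performance measures named in the abstract (PPV, sensitivity, specificity, and area under the receiver-operating characteristic curve) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the numbers bear no relation to the study's results.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = HOI present)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return PPV, sensitivity, and specificity; NaN when a denominator is zero."""
    c = confusion_counts(y_true, y_pred)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "ppv": ratio(c["tp"], c["tp"] + c["fp"]),          # TP / (TP + FP)
        "sensitivity": ratio(c["tp"], c["tp"] + c["fn"]),  # TP / (TP + FN)
        "specificity": ratio(c["tn"], c["tn"] + c["fp"]),  # TN / (TN + FP)
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in sample size, so only suitable
    for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 encounters with the outcome, 7 without.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into predictions.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

if __name__ == "__main__":
    print(classification_metrics(toy_labels, toy_predictions))
    print("auc:", auc(toy_labels, toy_scores))
```

Raising the threshold in a sketch like this generally trades sensitivity for PPV, which is one way a model could be tuned toward the high-PPV use case described in the Discussion.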
