Learning Optimal Individualized Treatment Rules from Electronic Health Record Data

Medical research is shifting from a “one-size-fits-all” strategy to a precision medicine approach, in which the right therapy is prescribed for the right patient at the right time. We propose a statistical method that uses electronic health record (EHR) data to estimate optimal individualized treatment rules (ITRs) tailored to subject-specific features. Our approach combines statistical modeling and medical domain knowledge with machine learning algorithms to support personalized medical decision making from EHR data. We transform the estimation of the optimal ITR into a classification problem and account for the non-experimental nature of EHR data and for confounding by clinical indication. We create a broad range of feature variables that reflect both patient health status and the healthcare data collection process. Using EHR data from the Columbia University clinical data warehouse, we construct a decision tree for choosing the best second-line therapy for patients with type 2 diabetes.
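To make the classification view concrete, here is a minimal sketch of one common way to cast ITR estimation as weighted classification: each patient's observed treatment is treated as a class label and weighted by outcome divided by the probability of receiving that treatment, so that a classifier which "misclassifies" well-responding patients pays a larger penalty. This is an illustration under stated assumptions, not the procedure of the paper itself, which the abstract does not spell out. The function names, the single feature (a hypothetical baseline HbA1c value), the synthetic data, and the fixed treatment probabilities of 0.5 are all assumptions; with real EHR data those probabilities would have to be estimated to adjust for confounding by indication. A depth-one tree (a stump) stands in for a full decision tree.

```python
from typing import List, Sequence, Tuple

Rule = Tuple[int, float, int, int]  # (feature index, threshold, arm if <=, arm if >)


def fit_weighted_stump(
    X: Sequence[Sequence[float]],
    A: Sequence[int],      # observed treatment, coded -1 or +1
    R: Sequence[float],    # observed outcome, larger is better, assumed >= 0
    P: Sequence[float],    # assumed probability of the treatment actually received
) -> Rule:
    """Exhaustively search depth-one rules for the one that minimizes
    sum_i (R_i / P_i) * 1[A_i != d(X_i)], the weighted misclassification loss."""
    weights = [r / p for r, p in zip(R, P)]
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for left, right in ((-1, 1), (1, -1), (1, 1), (-1, -1)):
                loss = sum(
                    w
                    for x, a, w in zip(X, A, weights)
                    if a != (left if x[j] <= t else right)
                )
                if best is None or loss < best[0]:
                    best = (loss, j, t, left, right)
    return best[1:]


def recommend(rule: Rule, x: Sequence[float]) -> int:
    """Apply a fitted stump to one patient's feature vector."""
    j, t, left, right = rule
    return left if x[j] <= t else right


# Synthetic toy data (not from any real patient): one hypothetical feature.
X: List[List[float]] = [[6.5], [7.0], [7.5], [8.5], [9.0], [9.5], [6.8], [9.2]]
A = [-1, 1, -1, 1, -1, 1, 1, -1]
R = [2.0, 0.5, 1.8, 2.2, 0.4, 2.5, 0.6, 0.3]
P = [0.5] * len(A)  # assumption: both treatments equally likely for everyone

rule = fit_weighted_stump(X, A, R, P)
print(rule, recommend(rule, [7.2]), recommend(rule, [8.8]))
```

In this toy data, patients with low feature values fare better on treatment -1 and those with high values on treatment +1, so the fitted stump splits between the two groups. A practical analysis would replace the exhaustive stump search with a standard tree learner that accepts per-sample weights.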
