Multi-Armed Angle-Based Direct Learning for Estimating Optimal Individualized Treatment Rules With Various Outcomes

ABSTRACT Estimating an optimal individualized treatment rule (ITR) based on patients’ information is an important problem in precision medicine. An optimal ITR is a decision function that optimizes patients’ expected clinical outcomes. Many existing methods in the literature are designed for binary treatment settings with a continuous outcome of interest. Much less work has been done on estimating optimal ITRs that remain interpretable in multiple treatment settings. In this article, we propose angle-based direct learning (AD-learning) to efficiently estimate optimal ITRs with multiple treatments. Our proposed method can be applied to various types of outcomes, such as continuous, survival, or binary outcomes. Moreover, it has an interesting geometric interpretation of the effect of different treatments for each individual patient, which can help doctors and patients make better decisions. We establish finite-sample error bounds that provide a theoretical guarantee for AD-learning. Finally, we demonstrate the superior performance of our method via an extensive simulation study and real data applications. Supplementary materials for this article are available online.
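The geometric interpretation mentioned above can be illustrated with a minimal sketch. It assumes the standard angle-based construction from multicategory classification, in which each of K treatments is represented by a unit vertex of a regular simplex in R^(K-1) and a patient is assigned the treatment whose vertex has the largest inner product with a fitted score vector f(x). The abstract does not spell out these details, so the vertex formula, the argmax rule, and the names `simplex_vertices` and `recommend` are assumptions made for illustration, not the article's implementation.

```python
import numpy as np

def simplex_vertices(k):
    """Return k unit vectors in R^(k-1) with equal pairwise angles.

    Assumed construction (common in angle-based classification):
    the vectors sum to zero and every pair has inner product -1/(k-1).
    """
    W = np.empty((k, k - 1))
    W[0] = (k - 1) ** -0.5
    for j in range(1, k):
        W[j] = -(1 + np.sqrt(k)) / (k - 1) ** 1.5
        W[j, j - 1] += np.sqrt(k / (k - 1))
    return W

def recommend(W, f_x):
    """Pick the treatment whose vertex is closest in angle to f_x.

    f_x is a hypothetical fitted score vector in R^(k-1); larger inner
    products are assumed to indicate a more favorable treatment.
    """
    return int(np.argmax(W @ f_x))

# Example with K = 4 treatments and a made-up score vector.
W = simplex_vertices(4)
f_x = 2.0 * W[2]          # points exactly toward the third vertex
print(recommend(W, f_x))  # index of the recommended treatment
```

Under these assumptions, the angle between f(x) and each vertex orders the treatments for a patient, and the length of f(x) reflects how strongly the treatments differ for that patient.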
