Optimal stratification in outcome prediction using baseline information

A common practice in predictive medicine is to use current study data to construct a stratification procedure, which groups subjects according to baseline information and forms stratum-specific prevention or intervention strategies. A desirable stratification scheme would not only have small intra-stratum variation but also have a clinically meaningful discriminatory capability. We show how to obtain optimal stratification rules with these properties by fitting a set of regression models that relate the outcome to baseline covariates and by creating scoring systems for predicting potential outcomes. We propose that all available optimal stratifications be evaluated with an independent dataset to select a final stratification. Lastly, we obtain inferential results for the selected stratification scheme with a holdout dataset. When only one study of moderate size is available, we combine the first two steps via cross-validation. Extensive simulation studies are used to compare the proposed stratification strategy with alternatives. We illustrate the new proposal with data from an AIDS clinical trial with binary outcomes and a cardiovascular clinical study with censored event time outcomes.
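The three-stage workflow described above (build candidate stratifications, select one on independent data, then report on a holdout set) can be sketched as follows. This is only an illustrative sketch under strong assumptions and not the authors' procedure: the data are simulated, the candidate models are ordinary least-squares fits on hypothetical covariate subsets, strata are formed by simple tertile cutoffs on the training scores rather than by an optimality criterion, and selection uses the within-stratum mean squared deviation. All function and variable names are hypothetical.

```python
import numpy as np

def fit_linear_score(X, y):
    """Least-squares fit; returns a function mapping covariates to a score."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ beta

def quantile_cutoffs(scores, n_strata):
    """Equal-probability cutoffs on the score (an assumption, not an optimal rule)."""
    probs = np.linspace(0, 1, n_strata + 1)[1:-1]
    return np.quantile(scores, probs)

def assign_strata(scores, cutoffs):
    """Stratum index 0..K-1 for each score, given K-1 increasing cutoffs."""
    return np.searchsorted(cutoffs, scores, side="right")

def within_stratum_error(y, strata, n_strata):
    """Mean squared deviation of outcomes from their own stratum mean."""
    err = 0.0
    for k in range(n_strata):
        yk = y[strata == k]
        if len(yk) > 0:
            err += np.sum((yk - yk.mean()) ** 2)
    return err / len(y)

# Simulated data (assumption): the outcome depends on the first two covariates only.
rng = np.random.default_rng(0)
n, p = 900, 4
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

# Three disjoint parts: model building, selection, final inference.
train, valid, hold = np.split(rng.permutation(n), [300, 600])

# Hypothetical candidate scoring systems, one per covariate subset.
candidates = {"x0": [0], "x0+x1": [0, 1], "x2+x3": [2, 3]}
n_strata = 3
results = {}
for name, cols in candidates.items():
    score = fit_linear_score(X[train][:, cols], y[train])
    cut = quantile_cutoffs(score(X[train][:, cols]), n_strata)
    strata_v = assign_strata(score(X[valid][:, cols]), cut)
    err = within_stratum_error(y[valid], strata_v, n_strata)
    results[name] = (err, score, cut, cols)

# Step 2: pick the candidate with the smallest validation error.
best = min(results, key=lambda k: results[k][0])

# Step 3: summarize the selected stratification on the untouched holdout set.
_, score, cut, cols = results[best]
strata_h = assign_strata(score(X[hold][:, cols]), cut)
holdout_means = [float(y[hold][strata_h == k].mean()) for k in range(n_strata)]
print(best, holdout_means)
```

Keeping the holdout set out of both fitting and selection is what allows the final stratum summaries to be read without a selection-induced optimism; when a single moderate-size study is all that is available, the first two parts would instead be combined by cross-validation, as the abstract notes.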
