Stagewise pseudo‐value regression for time‐varying effects on the cumulative incidence

In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either model all competing events with separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data may be right-censored, these regression models are implemented using a pseudo-value approach: for a grid of time points, the possibly unobserved binary event status is replaced by a jackknife pseudo-value based on the Aalen-Johansen estimator. We combine a stagewise regression technique with the pseudo-value approach to provide variable selection while allowing for time-varying effects. This is implemented by coupling variable selection across the grid times while determining the effect estimates separately for each grid time. The effect estimates are also regularized, so that models can be fitted with a low to moderate number of observations. The technique is illustrated in an application using clinical cancer registry data from hepatocellular carcinoma patients, and the results are contrasted with traditional hazard-based modeling. Besides offering a more straightforward interpretation, the proposed technique is seen to make the identification of time-varying effect patterns on the cumulative incidence feasible with a moderate number of observations.
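
The pseudo-value step described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names `aalen_johansen_cif` and `jackknife_pseudo_values`, the status coding (0 = censored, 1 = event of interest, 2 = competing event), and the toy data are all assumptions made here. For each observation i and grid time t, the pseudo-value is n times the Aalen-Johansen estimate from the full sample minus (n - 1) times the estimate from the sample with observation i left out.

```python
import numpy as np


def aalen_johansen_cif(time, status, grid, cause=1):
    """Aalen-Johansen cumulative incidence for `cause`, evaluated on `grid`.

    Assumed coding: status 0 = right-censored, 1, 2, ... = event type.
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    grid = np.asarray(grid, dtype=float)

    event_times = np.unique(time[status != 0])
    surv = 1.0  # probability of being event-free just before the current time
    cif = 0.0
    cif_steps = [0.0]  # value of the step function before the first event
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status != 0))
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        cif_steps.append(cif)

    # Right-continuous step function: take the value at the last
    # event time that is <= each grid point.
    idx = np.searchsorted(event_times, grid, side="right")
    return np.asarray(cif_steps)[idx]


def jackknife_pseudo_values(time, status, grid, cause=1):
    """Leave-one-out pseudo-values, returned as an (n, len(grid)) array."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    grid = np.asarray(grid, dtype=float)
    n = len(time)

    full = aalen_johansen_cif(time, status, grid, cause)
    pseudo = np.empty((n, len(grid)))
    for i in range(n):
        keep = np.arange(n) != i
        loo = aalen_johansen_cif(time[keep], status[keep], grid, cause)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo


# Toy data (hypothetical): six subjects, two of them censored.
toy_time = np.array([2.0, 3.0, 3.5, 5.0, 6.0, 8.0])
toy_status = np.array([1, 0, 2, 1, 0, 1])
toy_grid = np.array([3.0, 5.0, 7.0])
pv = jackknife_pseudo_values(toy_time, toy_status, toy_grid)
print(pv.round(3))
```

Without censoring, each pseudo-value reduces to the observed binary indicator that the event of interest occurred by the grid time; with censoring, values outside [0, 1] can occur. The rows of the resulting matrix can then serve as responses in a regression model at each grid time.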
