Simulating biologically plausible complex survival data

Simulation studies are conducted to assess the performance of existing and novel statistical models in pre-defined scenarios. It is often desirable that the chosen scenarios reflect a biologically plausible underlying distribution. This is particularly important in survival analysis, where distributions must be chosen for both the event time and the censoring time. This paper develops methods for generating survival times from complex distributions, so that statistical methods can be assessed under conditions that resemble practice. We describe a general algorithm involving numerical integration and root-finding techniques to generate survival times from a variety of complex parametric distributions, incorporating any combination of time-dependent effects, time-varying covariates, delayed entry, random effects and covariates measured with error. User-friendly Stata software is provided.
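The idea of combining numerical integration with root-finding can be illustrated with a minimal Python sketch. It is not the Stata implementation referred to above, and every name and parameter value in it (`weibull_tde_hazard`, `cumulative_hazard`, `simulate_survival_time`, `lam`, `beta0`, `beta1`, `t_max`) is a hypothetical choice made for illustration. The sketch assumes a Weibull baseline hazard with a covariate effect that varies linearly in time, integrates the hazard with composite Simpson's rule, and solves H(t) = -log(U) for t by bisection, where U is a uniform draw on (0, 1).

```python
import math
import random


def weibull_tde_hazard(t, x, lam=0.1, gamma=2.0, beta0=-0.5, beta1=0.1):
    """Illustrative hazard: Weibull baseline times a time-dependent covariate effect.

    h(t | x) = lam * gamma * t**(gamma - 1) * exp((beta0 + beta1 * t) * x)
    All parameter values are assumptions chosen for this example only.
    """
    return lam * gamma * t ** (gamma - 1.0) * math.exp((beta0 + beta1 * t) * x)


def cumulative_hazard(hazard, t, n_steps=200):
    """Approximate H(t), the integral of hazard(s) over [0, t], by composite Simpson's rule.

    n_steps must be even.
    """
    if t <= 0.0:
        return 0.0
    step = t / n_steps
    total = hazard(0.0) + hazard(t)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0  # odd interior nodes get 4, even get 2
        total += weight * hazard(i * step)
    return total * step / 3.0


def simulate_survival_time(hazard, u, t_max=100.0, tol=1e-8):
    """Solve H(t) = -log(u) for t by bisection on [0, t_max].

    Returns t_max if the event has not occurred by then, which a caller
    could treat as administrative censoring.
    """
    target = -math.log(u)
    if cumulative_hazard(hazard, t_max) < target:
        return t_max
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(hazard, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(2013)
    for x in (0.0, 1.0):
        # 1 - random() lies in (0, 1], which avoids log(0).
        times = [
            simulate_survival_time(lambda t: weibull_tde_hazard(t, x), 1.0 - rng.random())
            for _ in range(5)
        ]
        print(x, [round(t, 3) for t in times])
```

Bisection is used here only because it is simple and cannot leave the bracketing interval. A faster bracketing method, or a quadrature rule better suited to the hazard at hand, could be substituted without changing the structure. When `x = 0` the hazard reduces to a plain Weibull, whose survival times have the closed form t = sqrt(-log(U) / lam) for `gamma = 2`, which gives a convenient check on the numerical solution.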
