Model Selection for the Competing-Risks Model With and Without Masking

The competing-risks model is useful in settings in which individuals (or units) may die (or fail) from any of several causes. For some items, the cause of failure may be known only up to a subgroup of all causes, in which case we say that the failure is group-masked. A widely used approach for competing-risks data, with or without masking, is to specify cause-specific hazard rates. Piecewise constant hazards are often used because likelihood methods are available for estimation and testing, and because they offer model flexibility and computational convenience. However, for such piecewise constant hazard models, the choice of the endpoints of each interval on which the hazards are constant is usually subjective. In this article we discuss and propose the use of model selection methods that are data-driven and automatic. We compare three model selection procedures, based on the minimum description length principle, the Bayes information criterion, and the Akaike information criterion. A fast-splitting algorithm is the computational tool used to select among an enormous number of possible models. We test the effectiveness of the methods through numerical studies, including a real dataset with masked failure causes.
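
To make the role of the interval endpoints concrete, the following is a minimal sketch of how a candidate set of endpoints could be scored with the Akaike and Bayes information criteria. It is not the article's procedure. It assumes no masking (every failure cause is observed, with 0 denoting a censored item), uses the same cut points for every cause, and counts one parameter per cause per interval. The function names and the data layout are hypothetical. The minimum description length criterion is left out because its exact form is not given here.

```python
import math


def exposure_and_events(times, causes, cuts, cause):
    """Per-interval exposure time and number of failures from `cause`.

    `cuts` are sorted interior endpoints; the intervals are
    (0, c1], (c1, c2], ..., (cK, inf). Times are assumed positive.
    """
    edges = [0.0] + list(cuts) + [math.inf]
    n_intervals = len(edges) - 1
    exposure = [0.0] * n_intervals
    events = [0] * n_intervals
    for t, c in zip(times, causes):
        for j in range(n_intervals):
            lo, hi = edges[j], edges[j + 1]
            if t > lo:
                # Time the item spent at risk inside interval j.
                exposure[j] += min(t, hi) - lo
            if c == cause and lo < t <= hi:
                events[j] += 1
    return exposure, events


def max_loglik(times, causes, cuts, n_causes):
    """Maximized log-likelihood under piecewise constant cause-specific hazards.

    In each interval the hazard estimate is events / exposure, so the
    interval contributes d * log(d / e) - d (and 0 when d == 0).
    """
    ll = 0.0
    for cause in range(1, n_causes + 1):
        exposure, events = exposure_and_events(times, causes, cuts, cause)
        for e, d in zip(exposure, events):
            if d > 0:
                ll += d * math.log(d / e) - d
    return ll


def criteria(times, causes, cuts, n_causes):
    """AIC and BIC for one candidate set of endpoints (smaller is better)."""
    ll = max_loglik(times, causes, cuts, n_causes)
    n_params = n_causes * (len(cuts) + 1)  # assumption: one rate per cause per interval
    n = len(times)
    return {
        "loglik": ll,
        "aic": -2.0 * ll + 2.0 * n_params,
        "bic": -2.0 * ll + n_params * math.log(n),
    }
```

Calling `criteria` on several candidate values of `cuts` and keeping the one with the smallest criterion gives a data-driven choice of endpoints. An exhaustive comparison of this kind becomes infeasible as the number of candidate endpoints grows, which is why a search algorithm is needed in practice.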
