A versatile test for equality of two survival functions based on weighted differences of Kaplan–Meier curves

With censored event-time observations, the logrank test is the most popular tool for testing the equality of two underlying survival distributions. Although this test is asymptotically distribution-free, it may lack power when the proportional hazards assumption is violated. Various other testing procedures have been proposed, most of which are derived by assuming a specific class of alternative hypotheses with respect to the hazard functions. The test considered by Pepe and Fleming (1989) is based on a linear combination of weighted differences of the two Kaplan–Meier curves over time and is a natural tool for assessing the difference between two survival functions directly. In this article, we take a similar approach but choose weights that are proportional to the observed standardized difference of the estimated survival curves at each time point, so that the weighting adjusts empirically to the data. The new test statistic is aimed at a one-sided general alternative hypothesis; its distribution has a short right tail under the null hypothesis but a heavy tail under the alternative. Results from extensive numerical studies demonstrate that the new procedure performs well under various general alternatives, with the caveat that the type I error rate is slightly inflated when the sample size or the number of observed events is small. Survival data from a recent comparative cancer study are used to illustrate the implementation of the procedure.
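The idea of weighting the Kaplan–Meier difference by its own standardized value can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' exact statistic: the function names (`km_curve`, `weighted_km_statistic`, `permutation_pvalue`), the truncation time `tau`, the evaluation grid, the use of Greenwood's formula for the standard error, and the permutation calibration of the p-value are all assumptions introduced here for illustration.

```python
import numpy as np

def km_curve(time, event, grid):
    """Kaplan-Meier estimate and Greenwood variance evaluated at each grid point."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv_steps, green_steps = [1.0], [0.0]  # values before the first event
    s, g = 1.0, 0.0
    for t in event_times:
        n_at_risk = np.sum(time >= t)
        d = np.sum((time == t) & event)
        s *= 1.0 - d / n_at_risk
        if n_at_risk > d:  # Greenwood term is undefined when everyone at risk fails
            g += d / (n_at_risk * (n_at_risk - d))
        surv_steps.append(s)
        green_steps.append(g)
    # Index of the last event time <= each grid point (0 means "no event yet").
    idx = np.searchsorted(event_times, grid, side="right")
    surv = np.asarray(surv_steps)[idx]
    var = surv ** 2 * np.asarray(green_steps)[idx]
    return surv, var

def weighted_km_statistic(time, event, group, tau, n_grid=200):
    """Integrate (S1 - S0) weighted by the positive part of its standardized value.

    Assumption: weight(t) = max(Z(t), 0), with Z(t) = (S1 - S0) / se, which
    targets the one-sided alternative that group 1 survives longer.
    """
    time, event, group = map(np.asarray, (time, event, group))
    grid = np.linspace(0.0, tau, n_grid)
    s1, v1 = km_curve(time[group == 1], event[group == 1], grid)
    s0, v0 = km_curve(time[group == 0], event[group == 0], grid)
    diff = s1 - s0
    se = np.sqrt(v1 + v0)
    z = np.divide(diff, se, out=np.zeros_like(diff), where=se > 0)
    integrand = np.maximum(z, 0.0) * diff
    # Trapezoidal rule over the grid.
    return float(np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(grid)))

def permutation_pvalue(time, event, group, tau, n_perm=500, seed=0):
    """One-sided p-value from permuting group labels (an assumed calibration)."""
    rng = np.random.default_rng(seed)
    group = np.asarray(group)
    observed = weighted_km_statistic(time, event, group, tau)
    perm = [weighted_km_statistic(time, event, rng.permutation(group), tau)
            for _ in range(n_perm)]
    p = (1 + sum(v >= observed for v in perm)) / (n_perm + 1)
    return observed, p
```

Because the weight is the positive part of the standardized difference, time regions where the estimated curves favor the comparison group contribute nothing, which is one way a statistic of this kind can end up with a short right tail under the null. A label permutation is only valid when censoring does not differ between groups, so a different calibration may be needed in practice.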

[1]  R. Fonseca,et al.  Lenalidomide plus high-dose dexamethasone versus lenalidomide plus low-dose dexamethasone as initial therapy for newly diagnosed multiple myeloma: an open-label randomised controlled trial. , 2010, The Lancet. Oncology.

[2]  T. Fleming,et al.  Adaptive Test for Testing the Difference in Survival Distributions , 2003, Lifetime data analysis.

[3]  P. Royston,et al.  The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt , 2011, Statistics in medicine.

[4]  S G Self,et al.  An adaptive weighted log-rank test with application to cancer prevention and screening trials. , 1991, Biometrics.

[5]  R. Tarone,et al.  On the distribution of the maximum of the logrank statistic and the modified Wilcoxon statistic , 1981 .

[6]  D. Zucker,et al.  Weighted log rank type statistics for comparing survival curves when there is a time lag in the effectiveness of treatment , 1990 .

[7]  Michael R. Kosorok,et al.  The Versatility of Function-Indexed Weighted Log-Rank Statistics , 1999 .

[8]  James H. Ware,et al.  On distribution-free tests for equality of survival distributions , 1977 .

[9]  Lee-Jen Wei,et al.  Confidence bands for survival curves under the proportional , 1994 .

[10]  D. Harrington,et al.  Counting Processes and Survival Analysis , 1991 .

[11]  Song Yang.  Semiparametric inference on the absolute risk reduction and the restricted mean survival difference , 2013, Lifetime data analysis.

[12]  Xin Xu,et al.  Combining dependent tests for linkage or association across multiple phenotypic traits. , 2003, Biostatistics.

[13]  John D. Kalbfleisch,et al.  Misspecified proportional hazard models , 1986 .

[14]  Michael Parzen,et al.  Simultaneous Confidence Intervals for the Difference of Two Survival Functions , 1997 .

[15]  S W Lagakos,et al.  Properties of proportional-hazards score tests under misspecified regression models. , 1984, Biometrics.

[16]  M. Schumacher,et al.  Two-Sample Tests of Cramér–von Mises- and Kolmogorov–Smirnov-Type for Randomly Censored Data , 1984 .

[17]  Joseph L. Gastwirth,et al.  The Use of Maximin Efficiency Robust Tests in Combining Contingency Tables and Survival Analysis , 1985 .

[18]  Zhiliang Ying,et al.  Rank Regression Methods for Left-Truncated and Right-Censored Data , 1991 .

[19]  Jae Won Lee,et al.  SOME VERSATILE TESTS BASED ON THE SIMULTANEOUS USE OF WEIGHTED LOG-RANK STATISTICS , 1996 .

[20]  A A Tsiatis,et al.  Sequential Methods for Comparing Years of Life Saved in the Two‐Sample Censored Data Problem , 1999, Biometrics.

[21]  Ross L. Prentice,et al.  Semiparametric analysis of short-term and long-term hazard ratios with two-sample survival data , 2005 .

[22]  Thomas R. Fleming,et al.  Weighted Kaplan‐Meier Statistics: Large Sample and Optimality Considerations , 1991 .

[23]  Lu Tian,et al.  On the Restricted Mean Event Time in Survival Analysis , 2013 .

[24]  Lihui Zhao,et al.  Utilizing the integrated difference of two survival functions to quantify the treatment contrast for designing, monitoring, and analyzing a comparative clinical study , 2012, Clinical trials.

[25]  Song Yang,et al.  Improved Logrank‐Type Tests for Survival Data Using Adaptive Weights , 2010, Biometrics.

[26]  M S Pepe,et al.  Weighted Kaplan-Meier statistics: a class of distance tests for censored survival data. , 1989, Biometrics.

[27]  J O'Quigley,et al.  Estimating average regression effect under non-proportional hazards. , 2000, Biostatistics.

[28]  G Heimann,et al.  Permutational distribution of the log-rank statistic under random censorship with applications to carcinogenicity assays. , 1998, Biometrics.

[29]  R. Gill.  Large Sample Behaviour of the Product-Limit Estimator on the Whole Line , 1983 .

[30]  R. Latta,et al.  A Monte Carlo Study of Some Two-Sample Rank Tests with Censored Data , 1981 .

[31]  Lihui Zhao,et al.  Predicting the restricted mean event time with the subject's baseline covariates in survival analysis. , 2014, Biostatistics.

[32]  G. Neuhaus,et al.  Conditional Rank Tests for the Two-Sample Problem Under Random Censorship , 1993 .