Non‐parametric tests for right‐censored data with biased sampling

Testing the equality of two survival distributions can be difficult in a prevalent cohort study, where subjects are sampled nonrandomly. Because of the biased sampling scheme, the independent censoring assumption is often violated. Although the biased inference caused by length-biased sampling has been widely recognized in the statistical, epidemiological and economic literature, there is no satisfactory solution for efficient two-sample testing. We propose an asymptotically most efficient nonparametric test that properly adjusts for length-biased sampling. The test statistic is derived from a full likelihood function and can be generalized from the two-sample test to a k-sample test. The asymptotic properties of the test statistic under the null hypothesis are derived using its asymptotically independent and identically distributed representation. We conduct extensive Monte Carlo simulations to evaluate the performance of the proposed test statistics and compare them with the conditional test and the standard logrank test under different biased sampling schemes and right-censoring mechanisms. For length-biased data, the empirical studies demonstrate that the proposed test is substantially more powerful than the existing methods. For general left-truncated data, the proposed test is robust: it still maintains accurate control of the type I error rate and is more powerful than the existing methods, provided the truncation patterns and right-censoring patterns are the same between the groups. We illustrate the methods using two real data examples.
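To make the sampling problem concrete, the following minimal sketch simulates a prevalent cohort and shows how length bias arises. It is an illustration only, not the proposed test: the function name `simulate_prevalent_cohort`, the exponential survival distribution, the uniform onset times (a stationary incidence assumption), and all parameter values are assumptions introduced here, and right censoring is left out for simplicity.

```python
import random


def simulate_prevalent_cohort(n_incident, tau=20.0, rate=1.0, seed=0):
    """Simulate disease onsets on [0, tau] and recruit only prevalent cases.

    Hypothetical setup: onset times are uniform on [0, tau] and survival
    times are exponential with the given rate. A subject enters the cohort
    only if still alive at the recruitment time tau.
    """
    rng = random.Random(seed)
    population = []  # survival times of all incident cases
    sampled = []     # survival times of cases alive at recruitment
    for _ in range(n_incident):
        onset = rng.uniform(0.0, tau)
        survival = rng.expovariate(rate)
        population.append(survival)
        if onset + survival > tau:  # left truncation: must survive to tau
            sampled.append(survival)
    return population, sampled


def mean(values):
    return sum(values) / len(values)


population, sampled = simulate_prevalent_cohort(200_000)
pop_mean = mean(population)
sampled_mean = mean(sampled)

# Under these assumptions the population mean is 1 / rate, while the
# length-biased density is proportional to t * f(t), whose mean is 2 / rate.
print(f"mean survival, all incident cases: {pop_mean:.3f}")
print(f"mean survival, prevalent cohort:   {sampled_mean:.3f}")
```

Under these assumptions, the prevalent cohort over-represents long survivors, so its mean survival is roughly twice the population mean. A test that ignores this selection compares the biased distributions rather than the underlying ones.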
