Bayesian Nonparametric Estimation for Incomplete Data Via Successive Substitution Sampling

In the problem of estimating an unknown distribution function F in the presence of censoring, one can use a nonparametric estimator such as the Kaplan-Meier estimator, or one can fit a parametric model. In many situations, physical reasons indicate that a certain parametric model holds approximately. In these cases a nonparametric estimator may be very inefficient relative to a parametric estimator. On the other hand, if the parametric model is only a crude approximation to the actual model, then the parametric estimator may perform poorly relative to the nonparametric estimator, and may even be inconsistent. The Bayesian paradigm provides a reasonable framework for this problem. In a Bayesian approach, one would try to put a prior distribution on F that gives most of its mass to small neighborhoods of the entire parametric family. We show that certain priors based on the Dirichlet process prior can be used to accomplish this. For these priors the posterior distribution of F given the censored data appears to be analytically intractable. We provide a method for approximating this posterior distribution through the use of a successive substitution sampling algorithm. We also show convergence of the algorithm.
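
The idea of successive substitution sampling for censored data can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single Dirichlet process prior with total mass M and an exponential base distribution with a fixed rate (rather than the priors concentrated near a whole parametric family that the abstract describes), and right censoring only. The function name `sss_posterior_cdf` and all of its parameters are hypothetical. Each sweep replaces every censored value with a draw from its conditional distribution given the other values (the Polya urn predictive restricted to the region beyond the censoring time), and the conditional posterior mean of F is averaged over sweeps.

```python
import math
import random


def sss_posterior_cdf(times, censored, t_grid, M=1.0, rate=1.0,
                      n_sweeps=2000, burn_in=500, seed=0):
    """Approximate E[F(t) | data] for each t in t_grid.

    Assumed model (illustrative only): F ~ DP(M * G0), G0 = Exponential(rate).
    times[i] is an exact observation, or a right-censoring time when
    censored[i] is True (the true value is then known only to exceed it).
    """
    rng = random.Random(seed)
    n = len(times)
    cens_idx = [i for i in range(n) if censored[i]]

    # Start each censored value somewhere beyond its censoring time.
    x = list(times)
    for i in cens_idx:
        x[i] = times[i] + rng.expovariate(rate)

    totals = [0.0] * len(t_grid)
    kept = 0
    for sweep in range(n_sweeps):
        for i in cens_idx:
            c = times[i]
            # Other current values that are compatible with X_i > c.
            others = [x[j] for j in range(n) if j != i and x[j] > c]
            # Base-measure mass of (c, infinity): M * P(Exp(rate) > c).
            w_base = M * math.exp(-rate * c)
            u = rng.random() * (w_base + len(others))
            if not others or u < w_base:
                # Fresh draw from G0 truncated to (c, infinity); the
                # exponential is memoryless, so shift a plain draw by c.
                x[i] = c + rng.expovariate(rate)
            else:
                # Tie with one of the other compatible values.
                x[i] = rng.choice(others)
        if sweep >= burn_in:
            kept += 1
            for k, t in enumerate(t_grid):
                g0 = 1.0 - math.exp(-rate * t)
                count = sum(1 for v in x if v <= t)
                # Posterior mean of F(t) given the completed sample.
                totals[k] += (M * g0 + count) / (M + n)
    return [s / kept for s in totals]


if __name__ == "__main__":
    obs = [1.0, 2.0, 1.5, 3.0]
    cens = [False, True, False, True]
    print(sss_posterior_cdf(obs, cens, [0.5, 1.0, 2.0, 4.0]))
```

With no censored observations the sampler has nothing to impute, and the result reduces to the usual Dirichlet process posterior mean, a weighted average of the base distribution function and the empirical distribution function.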
