The Collapsed Gibbs Sampler in Bayesian Computations with Applications to a Gene Regulation Problem

Abstract. This article describes “grouping” and “collapsing” strategies for the Gibbs sampler and proves, from an operator theory viewpoint, that they are in general beneficial. The norms of the forward operators associated with the corresponding nonreversible Markov chains are used to discriminate among different simulation schemes. When applied to Bayesian missing data problems, collapsing suggests skipping the step of sampling parameter values in standard data augmentation, which yields a predictive update version of the Gibbs sampler. A procedure is presented for calculating the posterior odds ratio via the collapsed Gibbs sampler when incomplete observations are involved. As an illustration of possible applications, three examples are provided, along with a Bayesian treatment for identifying common protein binding sites in unaligned DNA sequences.
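To make the predictive update idea concrete, the following is a minimal sketch on a toy model that is an assumption of this example and not taken from the article: Bernoulli outcomes with a Beta(a, b) prior on the success probability, where some outcomes are missing. Standard data augmentation would alternate between drawing the parameter and drawing the missing values; here the parameter is integrated out, so each missing value is redrawn from its predictive distribution given all the other values. The function name and its arguments are hypothetical.

```python
import random


def collapsed_gibbs_bernoulli(observed, n_missing, a=1.0, b=1.0,
                              sweeps=1000, seed=0):
    """Predictive-update Gibbs sampler for missing Bernoulli outcomes.

    Toy model (an assumption of this sketch): y_i ~ Bernoulli(theta),
    theta ~ Beta(a, b). theta is never sampled; it is integrated out.
    """
    rng = random.Random(seed)
    n = len(observed) + n_missing
    # Arbitrary starting values for the missing outcomes.
    missing = [rng.randint(0, 1) for _ in range(n_missing)]
    total = sum(observed) + sum(missing)  # current count of ones
    draws = []
    for _ in range(sweeps):
        for i in range(n_missing):
            total -= missing[i]  # leave y_i out of the count
            # Predictive probability of y_i = 1 given the other n - 1 values.
            p_one = (a + total) / (a + b + n - 1)
            missing[i] = 1 if rng.random() < p_one else 0
            total += missing[i]
        draws.append(list(missing))
    return draws


if __name__ == "__main__":
    draws = collapsed_gibbs_bernoulli([1, 1, 0, 1, 1], n_missing=2)
    mean = sum(sum(d) for d in draws) / (2 * len(draws))
    print("average imputed value:", mean)
```

Each inner step conditions only on the other outcomes, which is what skipping the parameter draw amounts to in this toy setting. The article's operator-theoretic comparison of schemes is not reproduced here.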
