Automatic choice of driving values in Monte Carlo likelihood approximation via posterior simulations

For models with random effects or missing data, the likelihood function is sometimes analytically intractable but amenable to Monte Carlo approximation. For the approximation to be accurate, the parameter value that drives the simulations should be sufficiently close to the maximum likelihood estimate (MLE), which unfortunately is unknown. By introducing a working prior distribution, we express the likelihood function as a posterior expectation and approximate it using posterior simulations. If the sample size is large, the sample information is likely to outweigh the prior specification, so the posterior simulations concentrate around the MLE automatically and yield a good approximation of the likelihood near the MLE. For smaller samples, we propose using the current posterior as the next prior distribution, which moves the posterior simulations closer to the MLE and hence improves the likelihood approximation. Using the technique of data duplication, we can simulate from this sharpened posterior distribution without actually updating the prior distribution. The suggested method works well in several test cases. A more complex example involving censored spatial data is also discussed.
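The role of the driving value and the effect of data duplication can be illustrated on a toy problem. The sketch below is not the paper's estimator or any of its examples. It assumes a hypothetical conjugate model, y_i | b_i ~ N(b_i, 1) with random effects b_i ~ N(theta, 1) and a working prior theta ~ N(0, tau2), chosen only because every quantity is then available in closed form. It uses the standard importance-sampling estimate of a log-likelihood ratio driven by a value theta0, and it takes the posterior mean as a single driving value, which is a simplification of using the posterior simulations themselves. All function names are illustrative.

```python
import numpy as np

def simulate_data(n, theta_true, rng):
    """Toy data: b_i ~ N(theta, 1), then y_i | b_i ~ N(b_i, 1)."""
    b = rng.normal(theta_true, 1.0, size=n)
    return rng.normal(b, 1.0)

def exact_loglik_ratio(y, theta, theta0):
    """log L(theta) - log L(theta0); marginally y_i ~ N(theta, 2) in this toy model."""
    return (np.sum((y - theta0) ** 2) - np.sum((y - theta) ** 2)) / 4.0

def mc_loglik_ratio(y, theta, theta0, m, rng):
    """Monte Carlo estimate of log L(theta)/L(theta0), driven by theta0.

    Draws b ~ p(b | y, theta0) and averages f(y, b; theta) / f(y, b; theta0).
    Also returns the effective sample size of the importance weights.
    """
    # In the toy model, b_i | y_i, theta0 ~ N((y_i + theta0) / 2, 1/2).
    b = rng.normal((y + theta0) / 2.0, np.sqrt(0.5), size=(m, y.size))
    # Log weight: the N(y_i | b_i, 1) factors cancel between numerator and denominator.
    log_w = 0.5 * np.sum((b - theta0) ** 2 - (b - theta) ** 2, axis=1)
    shift = log_w.max()                      # stabilise the exponentials
    w = np.exp(log_w - shift)
    ess = w.sum() ** 2 / np.sum(w ** 2)
    return shift + np.log(w.mean()), ess

def duplicated_posterior(y, tau2, k):
    """Mean and variance of theta given the data repeated k times.

    Duplication raises the likelihood to the power k, which is what results
    from reusing the posterior as the next prior k - 1 times.
    """
    precision = 1.0 / tau2 + k * y.size / 2.0
    return (k * y.sum() / 2.0) / precision, 1.0 / precision

rng = np.random.default_rng(0)
y = simulate_data(n=10, theta_true=2.0, rng=rng)
mle = y.mean()                               # MLE of theta in the toy model

# A deliberately tight, badly centred working prior (tau2 = 0.1).
for k in (1, 20):
    theta0, var0 = duplicated_posterior(y, tau2=0.1, k=k)
    est, ess = mc_loglik_ratio(y, mle, theta0, m=20000, rng=rng)
    err = abs(est - exact_loglik_ratio(y, mle, theta0))
    print(f"k={k:2d}  driving value={theta0:.3f}  MLE={mle:.3f}  "
          f"abs error={err:.3f}  ESS={ess:.0f}")
```

In this toy setting the posterior based on duplicated data is centred closer to the MLE, so a driving value taken from it gives importance weights that are far less degenerate. In a realistic model the posterior would have to be simulated, for example by MCMC on the duplicated data, rather than computed in closed form.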
