Exact Subsampling MCMC

Speeding up Markov Chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention in the literature. Most of the proposed methods are approximate, and the only exact solution has been documented to be highly inefficient. We propose a simulation consistent subsampling method for estimating expectations of any function of the parameters using a combination of MCMC subsampling and the importance sampling correction for occasionally negative likelihood estimates in Lyne et al. (2015). Our algorithm is based on first obtaining an unbiased but not necessarily positive estimate of the likelihood. The estimator uses a soft lower bound such that the likelihood estimate is positive with a high probability, and computationally cheap control variables to lower variability. Second, we carry out a correlated pseudo marginal MCMC on the absolute value of the likelihood estimate. Third, the sign of the likelihood is corrected using an importance sampling step that has low variance by construction. We illustrate the usefulness of the method with two examples.