An adaptive MCMC method for multiple changepoint analysis with applications to large datasets

We consider the problem of Bayesian inference for changepoints where the number and positions of the changepoints are both unknown. In particular, we consider product partition models in which the model parameters for the regime between each pair of consecutive changepoints can be integrated out, leaving a posterior distribution over a latent binary vector that indicates the presence or absence of a changepoint at each observation. The same problem setting was considered by Fearnhead (2006), where filtering recursions are used to make exact inference. However, the complexity of that algorithm grows quadratically with the number of observations. Our approach relies on an adaptive Markov chain Monte Carlo (MCMC) method for finite discrete state spaces. We develop an adaptive algorithm that learns from the past states of the Markov chain in order to build proposal distributions that can quickly discover where changepoints are likely to be located. We prove that our algorithm is ergodic with respect to the posterior distribution. Crucially, we demonstrate that our adaptive MCMC algorithm is viable for large datasets for which the filtering recursions approach is not, and that inference on such datasets is possible in a reasonable time.
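To make the setting concrete, the following is a minimal sketch of a sampler over the latent changepoint indicator vector. It is not the algorithm of the paper. It assumes a Gaussian model with unit observation variance and a N(0, tau2) prior on each segment mean (so that segment parameters integrate out in closed form), an independent prior probability p of a changepoint at each position, and a simple adaptation rule that favours sites whose past flip proposals were accepted. All function names and the parameters tau2, p and eps are hypothetical, and no ergodicity claim is made for this particular rule.

```python
import numpy as np

def segment_log_marginal(y, tau2=10.0):
    """Log marginal likelihood of one segment: y_i ~ N(mu, 1), mu ~ N(0, tau2)."""
    m, s, q = y.size, y.sum(), np.dot(y, y)
    return (-0.5 * m * np.log(2 * np.pi)
            - 0.5 * np.log1p(m * tau2)
            - 0.5 * (q - tau2 * s * s / (1.0 + m * tau2)))

def log_posterior(y, gamma, p=0.01, tau2=10.0):
    """Unnormalised log posterior of the indicator vector gamma.

    gamma[i] == 1 means a changepoint lies between y[i] and y[i + 1].
    """
    cuts = np.flatnonzero(gamma) + 1
    loglik = sum(segment_log_marginal(seg, tau2) for seg in np.split(y, cuts))
    k = int(gamma.sum())
    return loglik + k * np.log(p) + (gamma.size - k) * np.log1p(-p)

def adaptive_flip_sampler(y, n_iter=20000, eps=0.1, seed=0):
    """Single-site flip Metropolis sampler with adaptive site selection.

    Returns the estimated posterior probability of a changepoint at each
    position, averaged over the second half of the run.
    """
    rng = np.random.default_rng(seed)
    d = y.size - 1
    gamma = np.zeros(d, dtype=int)
    logp = log_posterior(y, gamma)
    acc = np.ones(d)      # per-site acceptance rate, optimistic start
    counts = np.zeros(d)  # number of times each site was proposed
    incl = np.zeros(d)
    burn_in = n_iter // 2
    for t in range(1, n_iter + 1):
        # The selection weights depend on the history only, not on the
        # current state, so the flip proposal is symmetric within a step.
        raw = eps + (1.0 - eps) * acc
        i = rng.choice(d, p=raw / raw.sum())
        prop = gamma.copy()
        prop[i] = 1 - prop[i]
        logp_prop = log_posterior(y, prop)
        accepted = np.log(rng.random()) < logp_prop - logp
        if accepted:
            gamma, logp = prop, logp_prop
        # Running mean, so the change per update shrinks as counts grow.
        counts[i] += 1
        acc[i] += (float(accepted) - acc[i]) / counts[i]
        if t > burn_in:
            incl += gamma
    return incl / (n_iter - burn_in)
```

For example, calling `adaptive_flip_sampler(y)` on a series with a single large mean shift should concentrate the returned probabilities around the position of the shift. The sketch recomputes the full log posterior at every step for clarity; only the two segments adjacent to the flipped site actually change, so a practical implementation would update those alone.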

[1]  Jeffrey Scott Vitter, et al. Dynamic Generation of Discrete Random Variates, 1993, SODA '93.

[2]  Jeffrey S. Rosenthal, et al. The Containment Condition and AdapFail Algorithms, 2014, J. Appl. Probab.

[3]  J. Hartigan, et al. Product Partition Models for Change Point Problems, 1992.

[4]  A. Raftery, et al. Bayesian analysis of a Poisson process with a change-point, 1986.

[5]  S. Ogawa, et al. Oncogenic mutations of ALK kinase in neuroblastoma, 2008, Nature.

[6]  N. Friel, et al. Simulation-based Bayesian analysis for multiple changepoints, 2010, 1011.2932.

[7]  P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, 1995.

[8]  Arjun K. Gupta, et al. Parametric Statistical Change Point Analysis, 2000.

[9]  Francis R. Bach, et al. SegAnnDB: interactive Web-based genomic segmentation, 2014, Bioinform.

[10]  S. Chib. Estimation and comparison of multiple change-point models, 1998.

[11]  Haavard Rue, et al. Approximate simulation-free Bayesian inference for multiple changepoint models with dependence within segments, 2010, 1011.5038.

[12]  Paul Fearnhead, et al. Exact and efficient Bayesian inference for multiple changepoint problems, 2006, Stat. Comput.

[13]  Jim Griffin, et al. Individual adaptation: an adaptive MCMC scheme for variable selection problems, 2014.

[14]  Jeffrey S. Rosenthal, et al. Coupling and Ergodicity of Adaptive MCMC, 2007.

[15]  J. Yellott. The relationship between Luce's Choice Axiom, Thurstone's Theory of Comparative Judgment, and the double exponential distribution, 1977.

[16]  Marc Lavielle, et al. An application of MCMC methods for the multiple change-points problem, 2001, Signal Process.

[17]  Nando de Freitas, et al. Adaptive MCMC with Bayesian Optimization, 2012, AISTATS.

[18]  A. J. Walker. New fast method for generating discrete random numbers with arbitrary frequency distributions, 1974.

[19]  Michael D. Vose, et al. The simple genetic algorithm - foundations and theory, 1999, Complex adaptive systems.

[20]  D. Stephens. Bayesian Retrospective Multiple-Changepoint Identification, 1994.

[21]  K. Riedel. Numerical Bayesian Methods Applied to Signal Processing, 1996.

[22]  J. J. K. Ó Ruanaidh, et al. Numerical Bayesian methods applied to signal processing, 1996.

[23]  H. Haario, et al. An adaptive Metropolis algorithm, 2001.

[24]  Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.

[25]  P. Fearnhead, et al. Improved particle filter for nonlinear problems, 1999.