Bayesian estimation of dynamic weights in Gaussian mixture models

This paper proposes a generalization of Gaussian mixture models in which the mixture weight is allowed to vary as an unknown function of time. Results on simulated and real datasets show that the model captures the main features of the data. It can be useful in applications such as clustering, change-point detection, and process control. To estimate the mixture weight function, we propose two new Bayesian nonlinear dynamic approaches for polynomial models, which can be extended to other problems involving polynomial nonlinear dynamic models. The first method, called here component-wise Metropolis-Hastings, applies the Metropolis-Hastings algorithm to each local level component of the state equation. It is the more general of the two and can be used in any situation where the observation and state equations are nonlinearly connected. The second method tends to be faster but applies specifically to binary data (using the probit link function). The performance of both estimation methods, in the context of the proposed dynamic Gaussian mixture model, is evaluated on simulated datasets. In addition, an application to an array Comparative Genomic Hybridization (aCGH) dataset from glioblastoma cancer illustrates the proposal and highlights the ability of the method to detect chromosome aberrations.
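
To make the setup concrete, the following is a minimal Python sketch of one way such a model could look. It is not the paper's implementation, and the abstract does not state the exact specification. The sketch assumes a two-component mixture, a first-order polynomial (local level) state that follows a Gaussian random walk started at zero, and a probit link mapping the state to the mixture weight. The function names (`simulate_dynamic_mixture`, `mh_sweep`), the parameter values, and the random-walk proposal are all hypothetical choices made for illustration. The sweep is a generic single-site Metropolis-Hastings update of each state component given the latent allocations, which may differ from the component-wise scheme proposed in the paper.

```python
import math
import numpy as np


def normal_cdf(x):
    """Standard normal CDF applied elementwise (the inverse probit link)."""
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
                     for v in np.atleast_1d(x)])


def simulate_dynamic_mixture(T=500, mu=(0.0, 3.0), sigma=(1.0, 1.0),
                             state_sd=0.1, seed=0):
    """Simulate a two-component Gaussian mixture with a time-varying weight.

    Assumed model (illustrative only):
      eta_t = eta_{t-1} + w_t,  w_t ~ N(0, state_sd^2),  eta_{-1} = 0
      weight_t = Phi(eta_t)
      z_t ~ Bernoulli(weight_t),  y_t | z_t ~ N(mu[z_t], sigma[z_t]^2)
    """
    rng = np.random.default_rng(seed)
    eta = np.cumsum(rng.normal(0.0, state_sd, size=T))  # local-level state
    weight = normal_cdf(eta)                            # weight in (0, 1)
    z = (rng.uniform(size=T) < weight).astype(int)      # latent allocations
    mu, sigma = np.asarray(mu), np.asarray(sigma)
    y = rng.normal(mu[z], sigma[z])                     # observations
    return y, z, weight, eta


def log_target(eta_t, z_t, prev, nxt, state_var):
    """Log full conditional of one state component, up to a constant."""
    p = 0.5 * (1.0 + math.erf(eta_t / math.sqrt(2.0)))
    p = min(max(p, 1e-12), 1.0 - 1e-12)                 # guard against log(0)
    out = math.log(p) if z_t == 1 else math.log(1.0 - p)
    out -= 0.5 * (eta_t - prev) ** 2 / state_var        # link to previous state
    if nxt is not None:
        out -= 0.5 * (nxt - eta_t) ** 2 / state_var     # link to next state
    return out


def mh_sweep(eta, z, state_sd, step_sd, rng):
    """One single-site random-walk Metropolis-Hastings pass over all states."""
    eta = eta.copy()
    T = len(eta)
    state_var = state_sd ** 2
    accepted = 0
    for t in range(T):
        prev = eta[t - 1] if t > 0 else 0.0             # assumed eta_{-1} = 0
        nxt = eta[t + 1] if t < T - 1 else None
        current = eta[t]
        proposal = current + rng.normal(0.0, step_sd)
        log_ratio = (log_target(proposal, z[t], prev, nxt, state_var)
                     - log_target(current, z[t], prev, nxt, state_var))
        # Symmetric proposal, so the acceptance probability is min(1, ratio).
        if rng.uniform() < math.exp(min(0.0, log_ratio)):
            eta[t] = proposal
            accepted += 1
    return eta, accepted / T
```

Calling `simulate_dynamic_mixture()` returns the observations, the latent allocations, the weight path, and the state path. Repeated calls to `mh_sweep` on the state, holding the allocations fixed, give a chain of draws for the state. A complete sampler would also update the allocations and the component parameters, which this sketch leaves out.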
