Estimation of Viterbi path in Bayesian hidden Markov models

This article studies methods for estimating the Viterbi path in Bayesian hidden Markov models (HMMs). The Viterbi path is the estimate of the underlying state path that maximizes the joint posterior probability; hence it is also called the maximum a posteriori (MAP) path. For an HMM with known parameters, the Viterbi path can be found efficiently with the Viterbi algorithm. In the Bayesian framework, where the parameters are unknown, the Viterbi algorithm is not directly applicable, and several iterative methods can be used instead. We introduce a new EM-type algorithm for finding the MAP path and compare it with various other methods, including the variational Bayes approach and MCMC methods. Examples with simulated data are used to compare the performance of the methods. The main focus is on non-stochastic iterative methods, and our results show that the best of these work as well as or better than the best MCMC methods. Our results also demonstrate that when the primary goal is segmentation, it is more reasonable to perform segmentation directly, treating the transition and emission parameters as nuisance parameters.
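For the known-parameter case mentioned above, the Viterbi algorithm finds the MAP path by dynamic programming. The following is a minimal sketch in Python; the two-state "Healthy/Fever" toy model in the usage note is a standard textbook illustration, not an example from the article, and all names here are illustrative.

```python
# Minimal sketch of the Viterbi algorithm for an HMM with known
# parameters. Computations are done in log-space to avoid underflow.
import math

def viterbi(obs, states, log_pi, log_A, log_B):
    """Return the maximum a posteriori state path for `obs`.

    log_pi[s]    -- log initial probability of state s
    log_A[s][t]  -- log transition probability from state s to state t
    log_B[s][o]  -- log probability of emitting symbol o in state s
    """
    # delta[s] = best log joint probability of any path ending in state s
    delta = {s: log_pi[s] + log_B[s][obs[0]] for s in states}
    back = []  # backpointers, one dict per time step after the first
    for o in obs[1:]:
        ptr, new_delta = {}, {}
        for t in states:
            best_s = max(states, key=lambda s: delta[s] + log_A[s][t])
            ptr[t] = best_s
            new_delta[t] = delta[best_s] + log_A[best_s][t] + log_B[t][o]
        back.append(ptr)
        delta = new_delta
    # Backtrack from the best final state.
    last = max(states, key=delta.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

For example, with the classic two-state model (start probabilities 0.6/0.4, transitions H→H 0.7, H→F 0.3, F→H 0.4, F→F 0.6, and emissions normal/cold/dizzy), the observation sequence `("normal", "cold", "dizzy")` decodes to `["Healthy", "Healthy", "Fever"]`. In the Bayesian setting studied in the article, the parameters `log_pi`, `log_A`, `log_B` are not available, which is what motivates the iterative methods compared there.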
