MAP segmentation in Bayesian hidden Markov models: a case study

We consider the problem of estimating the maximum a posteriori (MAP) state sequence for a hidden Markov model (HMM) with a finite state space and a finite emission alphabet in the Bayesian setup, where both the emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data are used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is no longer applicable in this setup, there is no simple procedure for finding the MAP path, so several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, in which the parameters of the HMM are estimated from the training data.
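To illustrate why Viterbi-style dynamic programming breaks down, the following minimal Python sketch computes the joint log-probability of a state path and an observation sequence after the transition and emission matrices have been integrated out under their Dirichlet priors. The result depends on the path only through transition and emission counts, so it does not factorize into per-step terms. The function names, the uniform initial-state distribution, and the brute-force search are illustrative assumptions, not the algorithms compared in the paper.

```python
from itertools import product
from math import lgamma, log


def log_dirichlet_multinomial(counts, alpha):
    """Log marginal probability of an ordered sample with the given category
    counts when the category probabilities have a Dirichlet(alpha) prior."""
    result = lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(counts))
    for n, a in zip(counts, alpha):
        result += lgamma(a + n) - lgamma(a)
    return result


def log_joint(path, obs, trans_alpha, emit_alpha):
    """Log p(path, obs) with both matrices integrated out.

    trans_alpha[i] and emit_alpha[i] are the Dirichlet hyperparameters of
    row i of the transition and emission matrix, respectively.
    Assumption: the initial state is uniform over the K states.
    """
    K = len(trans_alpha)
    M = len(emit_alpha[0])
    trans_counts = [[0] * K for _ in range(K)]
    emit_counts = [[0] * M for _ in range(K)]
    for prev, cur in zip(path, path[1:]):
        trans_counts[prev][cur] += 1
    for state, symbol in zip(path, obs):
        emit_counts[state][symbol] += 1
    total = -log(K)
    for i in range(K):
        total += log_dirichlet_multinomial(trans_counts[i], trans_alpha[i])
        total += log_dirichlet_multinomial(emit_counts[i], emit_alpha[i])
    return total


def map_path_brute_force(obs, trans_alpha, emit_alpha):
    """Exhaustive MAP search over all K**T paths; feasible for tiny T only."""
    K = len(trans_alpha)
    return max(
        product(range(K), repeat=len(obs)),
        key=lambda path: log_joint(path, obs, trans_alpha, emit_alpha),
    )
```

For example, `map_path_brute_force([0, 0, 1, 1], [[2, 1], [1, 2]], [[2, 1], [1, 2]])` searches all 16 paths of a two-state model. The exponential cost of this exhaustive search is what motivates iterative approximations for sequences of realistic length.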
