A Coarse-to-Fine Approach to Computing the k-Best Viterbi Paths

The Hidden Markov Model (HMM) is a probabilistic model widely used in bioinformatics and speech recognition. Efficient algorithms for the most common HMM problems are well known, yet they all have a running time that is quadratic in the number of hidden states, which can be problematic for models with very large state spaces. The Viterbi algorithm finds the maximum likelihood hidden state sequence, and earlier work has shown that a coarse-to-fine modification can significantly speed up this algorithm on some models. We propose combining work on a k-best version of the Viterbi algorithm with the coarse-to-fine framework. The resulting algorithm may be used to approximate the total likelihood of the model, or to evaluate how good the Viterbi path is on very large models.
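To make the k-best idea concrete, the following is a minimal sketch of a plain k-best Viterbi recursion in log space. It is not the coarse-to-fine algorithm proposed here and is not taken from the paper; the function name `k_best_viterbi` and the parameter names `log_init`, `log_trans`, and `log_emit` are illustrative assumptions. For each state it keeps the k highest-scoring paths ending in that state, so the loop over all (previous state, state) pairs shows where the quadratic dependence on the number of hidden states comes from.

```python
import heapq
import math


def k_best_viterbi(obs, log_init, log_trans, log_emit, k):
    """Return up to k best hidden state paths as (log_prob, path) pairs.

    Hypothetical interface, assumed for illustration:
      obs       -- list of observation indices
      log_init  -- log_init[s]: log probability of starting in state s
      log_trans -- log_trans[r][s]: log probability of moving from r to s
      log_emit  -- log_emit[s][o]: log probability of state s emitting o
    """
    n_states = len(log_init)
    # beams[s] holds up to k (log_prob, path) pairs for paths ending in s.
    beams = [
        [(log_init[s] + log_emit[s][obs[0]], (s,))] for s in range(n_states)
    ]
    for o in obs[1:]:
        new_beams = []
        for s in range(n_states):
            # Extend every stored path from every previous state into s.
            # This double loop over (prev, s) is the quadratic cost.
            candidates = (
                (lp + log_trans[prev][s] + log_emit[s][o], path + (s,))
                for prev in range(n_states)
                for lp, path in beams[prev]
            )
            new_beams.append(heapq.nlargest(k, candidates))
        beams = new_beams
    # Merge the per-state lists and keep the k best overall.
    return heapq.nlargest(k, (item for beam in beams for item in beam))


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))
```

Applying `log_sum_exp` to the returned log probabilities gives a lower bound on the log of the total likelihood, since the k best paths are a subset of all paths. Comparing the first returned score with the following ones gives a rough sense of how strongly the Viterbi path dominates its nearest competitors.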
