Parameter Estimation in Pair-hidden Markov Models

This paper deals with parameter estimation in pair-hidden Markov models. We first provide a rigorous formalism for these models and discuss possible definitions of likelihoods. The model is biologically motivated and therefore naturally leads to restrictions on the parameter space. The existence of two different information divergence rates is established, and a divergence property is shown under additional assumptions. This yields consistency for the parameter in parametrization schemes for which the divergence property holds. Simulations illustrate different cases that are not covered by our results. Copyright 2006 Board of the Foundation of the Scandinavian Journal of Statistics.
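To make the notion of a likelihood in a pair-hidden Markov model concrete, here is a minimal sketch of one possible definition: the probability that the hidden path passes through the point (n, m) while emitting the two observed sequences. It is an illustration under stated assumptions, not the paper's own formalism. The three-state layout (match, insert in x, insert in y), the function name `pair_hmm_likelihood`, and all parameter names are hypothetical choices made for this example.

```python
import numpy as np

# Assumed hidden states and the step each one takes on the (i, j) lattice:
# 0 = match, emits a pair (x_i, y_j); 1 = emits x_i only; 2 = emits y_j only.
STEPS = ((1, 1), (1, 0), (0, 1))


def pair_hmm_likelihood(x, y, init, trans, emit_match, emit_x, emit_y):
    """Forward recursion summing over all hidden paths that end at (len(x), len(y)).

    x, y        : sequences of integer symbols
    init        : length-3 vector, distribution of the first hidden state
    trans       : 3x3 matrix, trans[a, b] = P(next state b | current state a)
    emit_match  : matrix, emit_match[u, v] = P(emit pair (u, v) | match)
    emit_x/_y   : vectors, single-symbol emission distributions
    """
    n, m = len(x), len(y)
    # fwd[k, i, j]: probability of emitting x[:i], y[:j] with last state k.
    fwd = np.zeros((3, n + 1, m + 1))
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            for k, (di, dj) in enumerate(STEPS):
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue  # this state cannot arrive at (i, j)
                if k == 0:
                    e = emit_match[x[i - 1], y[j - 1]]
                elif k == 1:
                    e = emit_x[x[i - 1]]
                else:
                    e = emit_y[y[j - 1]]
                if pi == 0 and pj == 0:
                    prev = init[k]  # first step of the path
                else:
                    prev = fwd[:, pi, pj] @ trans[:, k]
                fwd[k, i, j] = prev * e
    return fwd[:, n, m].sum()


# Toy parameters over a binary alphabet (illustrative values only).
init = np.array([0.5, 0.25, 0.25])
trans = np.tile(init, (3, 1))
emit_match = np.full((2, 2), 0.25)
emit_x = np.array([0.5, 0.5])
emit_y = np.array([0.5, 0.5])

likelihood = pair_hmm_likelihood([0], [1], init, trans, emit_match, emit_x, emit_y)
```

Other definitions are possible, for example conditioning on the path passing through (n, m); this sketch makes no claim about which one the consistency results refer to.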
