Applications of Hidden Markov Models for Characterization of Homologous DNA Sequences with a Common Gene

Identifying and characterizing the structure in genome sequences is one of the principal challenges in modern molecular biology, and comparative genomics offers a powerful tool. In this paper, we introduce a hidden Markov model that allows a comparative analysis of multiple sequences related by a phylogenetic tree, and we present an efficient method for estimating the parameters of the model. The model integrates structure prediction methods for one sequence, statistical multiple alignment methods, and phylogenetic information. This unified model is particularly useful for a detailed characterization of DNA sequences with a common gene. We illustrate the model on a variety of homologous sequences.
