Score Functions for Determining Regional Conservation in Two-Species Local Alignments

We construct several score functions for locating unusually conserved regions in a genome-wide search of aligned DNA from two species, and we test them on regions of the human genome aligned to the mouse genome. The score functions are derived from properties of neutrally evolving sites in the mouse and human genomes and can be adjusted to the local background rate of conservation. Their aim is to identify regions of the human genome that are conserved by evolutionary selection because they have an important function, rather than by chance. We also use them to obtain a very rough estimate of the amount of DNA in the human genome that is under selection.
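To make the idea of scoring a window against a local neutral background concrete, here is a minimal sketch. It is not one of the score functions constructed in the paper: it assumes a simple binomial model in which each ungapped aligned column is identical with probability `p_neutral` under neutral evolution, and it reports how many standard deviations the observed number of identities lies above that expectation. The function name `conservation_zscore` and the parameter `p_neutral` are hypothetical, introduced only for illustration.

```python
import math


def conservation_zscore(human: str, mouse: str, p_neutral: float) -> float:
    """Z-score of observed identities in an aligned window.

    Assumption (not from the paper): columns are independent and each
    ungapped column matches with probability p_neutral under neutrality.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    if not 0.0 < p_neutral < 1.0:
        raise ValueError("p_neutral must lie strictly between 0 and 1")

    # Keep only columns where neither species has a gap character.
    columns = [
        (h, m)
        for h, m in zip(human.upper(), mouse.upper())
        if h != "-" and m != "-"
    ]
    n = len(columns)
    if n == 0:
        raise ValueError("window contains no ungapped columns")

    # Observed identities versus the binomial mean and standard deviation.
    matches = sum(1 for h, m in columns if h == m)
    expected = n * p_neutral
    std_dev = math.sqrt(n * p_neutral * (1.0 - p_neutral))
    return (matches - expected) / std_dev
```

A window would then be flagged as unusually conserved when its score exceeds a chosen threshold, with `p_neutral` re-estimated for each region (for example from nearby sites believed to be neutral) so that the score tracks the local background rate of conservation.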
