Protein evolution constraints and model-based techniques to study them.

Statistical tools for assessing the evolutionary roles of mutation and natural selection from interspecific sequence data have improved substantially. It is now better appreciated that the rate at which a point mutation occurs can depend on the DNA sequence at the sites surrounding it, and this dependence can be accommodated in probabilistic models of protein evolution. To quantify the evolutionary impact of some aspect of phenotype, one promising strategy is to develop a system for predicting phenotype from the DNA sequence and then to infer how the rates of sequence change are affected by the predicted phenotypic consequences of the changes. Although statistical tools for characterizing protein evolution are improving, the list of candidate phenomena that can affect rates of protein evolution is long, and the relative contributions of these phenomena are only beginning to be disentangled.
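As a rough illustration of the two ideas in the abstract, the sketch below combines a context-dependent mutation rate with a selection term driven by a predicted phenotype. It is not the authors' model. The CpG multiplier, the linear link between phenotype change and the scaled selection coefficient, and all function and parameter names are assumptions made for the example. The only standard ingredient is the textbook population-genetic expression S / (1 - exp(-S)) for the fixation rate of a mutation relative to a neutral one.

```python
import math


def context_mutation_rate(seq, pos, base_rate=1.0, cpg_multiplier=10.0):
    """Hypothetical context-dependent mutation rate at position `pos`.

    Assumption for illustration only: a C followed by G, or a G preceded
    by C (a CpG dinucleotide), mutates `cpg_multiplier` times faster.
    """
    base = seq[pos]
    next_base = seq[pos + 1] if pos + 1 < len(seq) else ""
    prev_base = seq[pos - 1] if pos > 0 else ""
    in_cpg = (base == "C" and next_base == "G") or (base == "G" and prev_base == "C")
    return base_rate * cpg_multiplier if in_cpg else base_rate


def relative_fixation_rate(s_scaled):
    """Fixation rate relative to a neutral mutation: S / (1 - exp(-S)).

    `s_scaled` is the population-scaled selection coefficient; the value
    tends to 1 as S approaches 0. Intended for moderate |S|, since very
    negative S would overflow math.exp.
    """
    if abs(s_scaled) < 1e-8:
        return 1.0
    return s_scaled / (1.0 - math.exp(-s_scaled))


def substitution_rate(mutation_rate, phenotype_before, phenotype_after, sensitivity):
    """Mutation rate times a selection factor from a predicted phenotype change.

    Assumption: the scaled selection coefficient is linear in the change of
    a predicted phenotype score, with slope `sensitivity`. In an actual
    analysis, `sensitivity` is the kind of parameter one would estimate.
    """
    s_scaled = sensitivity * (phenotype_after - phenotype_before)
    return mutation_rate * relative_fixation_rate(s_scaled)


if __name__ == "__main__":
    seq = "ACGT"
    mu = context_mutation_rate(seq, 1)  # the C sits in a CpG context
    # Hypothetical phenotype scores before and after the change.
    print(substitution_rate(mu, phenotype_before=0.8, phenotype_after=0.5, sensitivity=4.0))
```

With a positive `sensitivity`, changes predicted to lower the phenotype score are accepted more slowly than neutral changes, and changes predicted to raise it are accepted faster; a `sensitivity` of zero recovers a purely mutation-driven rate.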
