A simple statistical method for estimating type-II (cluster-specific) functional divergence of protein sequences.

Predicting functional amino acid residues in silico is important for comparative genomics. In this paper, we focus on how to statistically identify cluster-specific amino acid residues that are related to functional divergence after gene duplication. We approach this problem using a framework based on site-specific shifts of amino acid property (type-II functional divergence), as opposed to site-specific shifts of evolutionary rate (type-I functional divergence). An efficient statistical procedure is implemented to facilitate the development of phylogenomic databases of cluster-specific residues for large-scale protein families. Our method has two features: (1) a statistical test for type-II functional divergence and (2) a site-specific Bayesian profile that measures how amino acid residues contribute to type-II (cluster-specific) functional divergence. Consequently, one may obtain the posterior probability for "functional" cluster-specific residues. Case studies are presented and indicate that radical cluster-specific residues are responsible for most of the inferred type-II functional divergence, whereas conserved cluster-specific residues appear to contribute less to this type of functional divergence than even imperfect radical cluster-specific residues.
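To make the distinction between radical and conserved cluster-specific residues concrete, the following is a minimal sketch that sorts alignment columns into site patterns for two duplicate-gene clusters. It is not the Bayesian procedure described above and computes no posterior probabilities. The four-way property grouping, the function names `classify_site` and `classify_alignment`, and the reading of "conserved cluster-specific" as a fixed difference within one property group are all assumptions made for illustration.

```python
from typing import List

# Assumed property grouping of the 20 amino acids (illustrative choice,
# not stated in the text above).
GROUPS = {
    "positive": "KRH",
    "negative": "DE",
    "hydrophilic": "STNQCGP",
    "hydrophobic": "AILMFWVY",
}
AA_TO_GROUP = {aa: name for name, aas in GROUPS.items() for aa in aas}


def classify_site(col_a: str, col_b: str) -> str:
    """Label one alignment column, given the residues observed in
    cluster A (col_a) and cluster B (col_b)."""
    # Gaps or ambiguous symbols cannot be assigned to a property group.
    if any(aa not in AA_TO_GROUP for aa in col_a + col_b):
        return "unknown"
    # A cluster-specific site must be fixed within each cluster.
    if len(set(col_a)) > 1 or len(set(col_b)) > 1:
        return "variable"
    a, b = col_a[0], col_b[0]
    if a == b:
        return "identical"
    # Fixed difference between clusters: radical if the property group changes.
    if AA_TO_GROUP[a] != AA_TO_GROUP[b]:
        return "radical cluster-specific"
    return "conserved cluster-specific"


def classify_alignment(cluster_a: List[str], cluster_b: List[str]) -> List[str]:
    """Classify every column of two aligned clusters of equal length."""
    cols_a = ["".join(col) for col in zip(*cluster_a)]
    cols_b = ["".join(col) for col in zip(*cluster_b)]
    return [classify_site(a, b) for a, b in zip(cols_a, cols_b)]


if __name__ == "__main__":
    # Toy alignment: two sequences per cluster, four columns.
    labels = classify_alignment(["KKAK", "KKAR"], ["EKAE", "EKAE"])
    for i, label in enumerate(labels, start=1):
        print(i, label)
```

A pattern count of this kind only shows which sites are candidates; deciding whether they reflect functional divergence requires the statistical test and the site-specific profile that the method provides.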
