Selecton 2007: advanced models for detecting positive and purifying selection using a Bayesian inference approach

Biologically significant sites in a protein may be identified by contrasting the rates of synonymous (Ks) and non-synonymous (Ka) substitutions. This contrast enables the inference of site-specific positive Darwinian selection and purifying selection. We present here Selecton version 2.2 (http://selecton.bioinfo.tau.ac.il), a web server that automatically calculates the ratio between Ka and Ks (ω) at each site of the protein. The ratio is displayed graphically at each site using a color-coding scheme that indicates positive selection, purifying selection or lack of selection. Selecton implements a set of evolutionary models that allow statistical testing of the hypothesis that a protein has undergone positive selection. Specifically, the recently developed mechanistic-empirical model is introduced, which takes into account the physicochemical properties of amino acids. Advanced options allow the server to be fine-tuned to the user's specific needs: calculation of statistical support for the ω values, an advanced graphic display of the protein's three-dimensional structure, use of different genetic codes and input of a pre-built phylogenetic tree. Selecton version 2.2 is an effective, user-friendly and freely available web server that implements up-to-date methods for computing site-specific selection forces and for visualizing these forces on the protein's sequence and structure.
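As a rough illustration of how a per-site ω value maps onto the three categories named above, the following minimal Python sketch labels sites by comparing ω with 1. It is not Selecton's implementation: the function names, the `tolerance` band around ω = 1 and the example values are assumptions made for illustration only, and the server itself relies on Bayesian inference with statistical support rather than a fixed cutoff.

```python
from typing import List


def classify_site(omega: float, tolerance: float = 0.1) -> str:
    """Label one site from its Ka/Ks ratio (omega).

    The tolerance band around 1 is a hypothetical choice for this sketch,
    not a threshold taken from Selecton.
    """
    if omega < 0:
        raise ValueError("omega must be non-negative")
    if omega > 1.0 + tolerance:
        return "positive"    # Ka exceeds Ks: candidate positive selection
    if omega < 1.0 - tolerance:
        return "purifying"   # Ka below Ks: candidate purifying selection
    return "neutral"         # Ka roughly equal to Ks: no evidence of selection


def classify_sites(omegas: List[float], tolerance: float = 0.1) -> List[str]:
    """Apply classify_site to every site of a protein."""
    return [classify_site(w, tolerance) for w in omegas]


# Made-up omega values for four sites, for illustration only.
example_omegas = [0.05, 0.4, 1.0, 2.3]
labels = classify_sites(example_omegas)
print(labels)
```

A point estimate of ω above 1 is only suggestive; claiming positive selection for a site or a protein still requires the kind of statistical testing the server provides.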
