PSeq-Gen: an application for the Monte Carlo simulation of protein sequence evolution along phylogenetic trees

PSeq-Gen simulates the evolution of protein sequences along evolutionary trees, following the procedures previously reported for the DNA sequence simulator Seq-Gen (Sequence-Generator; Rambaut and Grassly, 1997). Statistics calculated from the simulated sequences can be used to give expectations under specific null hypotheses of protein evolution. This Monte Carlo simulation approach to testing hypotheses is often termed 'parametric bootstrapping' (see, for example, Efron, 1985; Huelsenbeck et al., 1996; Huelsenbeck and Rannala, 1997), and has many powerful applications, such as testing the molecular clock (Goldman, 1993), detecting recombination (Grassly and Holmes, 1997), and evaluating competing phylogenetic hypotheses (Hillis et al., 1996). Three common models of amino acid substitution are implemented (PAM: Dayhoff et al., 1978; JTT: Jones et al., 1992; mtREV: Adachi and Hasegawa, 1995). These models use instantaneous rate matrices derived from observed patterns of accepted point mutations in aligned nuclear (PAM, JTT) and mitochondrial (mtREV) genes. The amino acid frequencies found in these alignments are the default for each model, but the frequencies can also be specified independently in this implementation [following Adachi (1995), but see also Kishino et al. (1990)]. In addition, site-specific rate heterogeneity following a gamma distribution is allowed (as described in Yang, 1993; Rambaut and Grassly, 1997). As with Seq-Gen, any number of trees may be read in and any number of data sets can be simulated for each tree, so large sets of replicate simulations can be created easily. Maximum-likelihood algorithms exist that reconstruct phylogenies from protein sequences using the PAM, JTT and mtREV substitution models (e.g. Adachi and Hasegawa, 1995; Yang, 1996), and hence parametric bootstrapping to obtain confidence limits on phylogenetic parameters is easily achieved.
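The general procedure described above (draw a root sequence, then evolve it down each branch with per-site gamma rates) can be illustrated with a minimal Python sketch. This is not PSeq-Gen's code. Because the PAM, JTT and mtREV rate matrices are not reproduced here, the sketch assumes a simpler stand-in model in which every substitution draws the new residue from the equilibrium frequencies. The function name `simulate`, the nested-tuple tree layout and the parameter names are all hypothetical choices made for illustration.

```python
import math
import random

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

def simulate(tree, n_sites, freqs=None, alpha=None, rng=None):
    """Simulate amino acid sequences at the tips of a tree.

    tree    : nested tuples (name, branch_length, children); hypothetical layout
    freqs   : 20 equilibrium frequencies (uniform if omitted)
    alpha   : gamma shape for among-site rate variation (None = equal rates)
    Assumed stand-in model: a substitution event replaces the residue with one
    drawn from `freqs`. It is NOT the PAM, JTT or mtREV matrix.
    """
    rng = rng or random.Random()
    if freqs is None:
        freqs = [1.0 / 20] * 20
    # Scale so branch lengths are expected substitutions per site.
    mu = 1.0 / (1.0 - sum(f * f for f in freqs))
    # One relative rate per site, gamma-distributed with mean 1.
    if alpha is None:
        rates = [1.0] * n_sites
    else:
        rates = [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]

    leaves = {}

    def evolve(node, parent_seq):
        name, length, children = node
        seq = []
        for state, rate in zip(parent_seq, rates):
            # With probability exp(-mu * rate * t) no event hits this site.
            if rng.random() < math.exp(-mu * rate * length):
                seq.append(state)
            else:
                seq.append(rng.choices(AMINO_ACIDS, weights=freqs)[0])
        if children:
            for child in children:
                evolve(child, seq)
        else:
            leaves[name] = "".join(seq)

    root_seq = rng.choices(AMINO_ACIDS, weights=freqs, k=n_sites)
    evolve(tree, root_seq)
    return leaves

# Example: a three-taxon tree, one simulated data set.
tree = ("root", 0.0, [("A", 0.1, []),
                      ("internal", 0.05, [("B", 0.2, []), ("C", 0.2, [])])])
alignment = simulate(tree, n_sites=60, alpha=0.5, rng=random.Random(1))
for name, seq in alignment.items():
    print(name, seq)
```

Calling `simulate` repeatedly on the same tree yields replicate data sets. A statistic computed on each replicate then gives its expected distribution under the simulated model, which is the parametric bootstrap idea in miniature.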

[1] B. Rannala, et al. Phylogenetic methods come of age: testing hypotheses in an evolutionary context, 1997, Science.

[2] Andrew Rambaut, et al. Bi-De: an application for simulating phylogenetic processes, 1996, Comput. Appl. Biosci.

[3] Z. Yang, et al. Among-site rate variation and its impact on phylogenetic analyses, 1996, Trends in Ecology & Evolution.

[4] 足立 淳, et al. Modeling of molecular evolution and maximum likelihood inference of molecular phylogeny, 1995.

[5] D. Hillis. Approaches for Assessing Phylogenetic Accuracy, 1995.

[6] J. Huelsenbeck. Performance of Phylogenetic Methods in Simulation, 1995.

[7] M. Schoniger, et al. Simulating efficiently the evolution of DNA sequences, 1995, Comput. Appl. Biosci.

[8] M. Nei, et al. Relative efficiencies of the maximum-likelihood, neighbor-joining, and maximum-parsimony methods when substitution rate varies with site, 1994, Molecular Biology and Evolution.

[9] Hideo Matsuda, et al. fastDNAmL: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood, 1994, Comput. Appl. Biosci.

[10] Z. Yang, et al. Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites, 1993, Molecular Biology and Evolution.

[11] Theodore Garland, et al. Phylogenetic Analysis of Covariance by Computer Simulation, 1993.

[12] J. Bull, et al. Experimental molecular evolution of bacteriophage T7, 1993, Evolution.

[13] William R. Taylor, et al. The rapid generation of mutation data matrices from protein sequences, 1992, Comput. Appl. Biosci.

[14] J. Bull, et al. Experimental phylogenetics: generation of a known phylogeny, 1992, Science.

[15] B. Efron. Bootstrap confidence intervals for a class of parametric problems, 1985.

[16] Andrew Rambaut, et al. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees, 1997, Comput. Appl. Biosci.

[17] J. Felsenstein, et al. A Hidden Markov Model approach to variation among sites in rate of evolution, 1996, Molecular Biology and Evolution.

[18] P. Lewis, et al. Success of maximum likelihood phylogeny inference in the four-taxon case, 1995, Molecular Biology and Evolution.

[19] J. Adachi, et al. MOLPHY, programs for molecular phylogenetics, 1992.

[20] M. O. Dayhoff, et al. A Model of Evolutionary Change in Proteins, 1978.

[21] T. Jukes. Evolution of Protein Molecules, 1969.