An improved statistical method for detecting heterotachy in nucleotide sequences.

The principle of heterotachy states that the substitution rate of sites in a gene can change through time. In this article, we propose a powerful statistical test to detect sites that evolve according to the process of heterotachy. We apply this test to an alignment of 1289 eukaryotic rRNA molecules to 1) determine how widespread the phenomenon of heterotachy is in ribosomal RNA, 2) test whether these heterotachous sites are nonrandomly distributed, that is, linked to secondary structure features of ribosomal RNA, and 3) determine the impact of heterotachous sites on the bootstrap support of monophyletic groupings. Our study revealed that, with 21 monophyletic taxa, approximately two-thirds of the sites in the considered set of sequences are heterotachous. Although the detected heterotachous sites do not appear to be bound to specific structural features of the small subunit rRNA, their presence is shown to have a large beneficial influence on the bootstrap support of monophyletic groups. Using extensive testing, we show that this may not be due to heterotachy itself but merely to the increased substitution rate at the detected heterotachous sites.
