Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees

Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking.

Results: This study is the first to address this gap. We assessed the accuracy and robustness of phylogenetic reconstruction with individual secondary structures in simulation experiments on artificial tree topologies with up to 18 taxa and at divergence levels typical of phylogenetic studies. We chose the internal transcribed spacer 2 (ITS2) of the ribosomal cistron as an exemplary marker region. The simulation integrated the coevolution of sequences with their secondary structures. Additionally, we investigated the phylogenetic power of doubling the marker size and compared it with sequence-only and sequence-structure reconstruction methods. The results clearly show that structural information largely improves both the accuracy and the robustness of Neighbor Joining trees compared with sequence-only data, whereas doubling the marker size improves only robustness.

Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain in information content that is useful for phylogenetics. Thus, using ITS2 sequence together with secondary structure for taxonomic inference is recommended. Other reconstruction methods, such as maximum likelihood, Bayesian inference, or maximum parsimony, may profit equally from the inclusion of secondary structure.

Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin.
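To illustrate why structural information can add signal beyond the nucleotides alone, the following minimal sketch compares an uncorrected distance on sequence-only data with the same distance on combined sequence-structure states. It rests on assumptions not stated in the abstract: structures are given in dot-bracket notation, each aligned position is encoded as a nucleotide plus its pairing state (giving 4 × 3 = 12 possible states), and a simple p-distance stands in for whatever corrected distance a real analysis would use. The function names and toy sequences are hypothetical.

```python
def encode(seq, struct):
    """Combine nucleotides and dot-bracket symbols into sequence-structure states.

    Assumption: each position becomes a two-character token such as 'G(' or 'A.',
    so the alphabet has 4 nucleotides x 3 pairing states = 12 states.
    """
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have equal length")
    return [nuc + state for nuc, state in zip(seq, struct)]


def p_distance(a, b):
    """Uncorrected distance: fraction of aligned positions that differ."""
    if len(a) != len(b):
        raise ValueError("inputs must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy data: one substitution (G -> A) opens a base pair,
# so the partner C changes its pairing state without changing its nucleotide.
seq_a, struct_a = "GGGAAACCC", "(((...)))"
seq_b, struct_b = "GGAAAACCC", "((.....))"

d_seq = p_distance(seq_a, seq_b)
d_seq_struct = p_distance(encode(seq_a, struct_a), encode(seq_b, struct_b))

print(f"sequence only:      {d_seq:.3f}")         # 0.111
print(f"sequence-structure: {d_seq_struct:.3f}")  # 0.222
```

In this toy case the sequence-only comparison sees one difference, while the sequence-structure comparison also registers the lost pairing at the partner position. A distance matrix built this way could then be passed to any Neighbor Joining implementation.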
