How much data are needed to resolve a difficult phylogeny? Case study in Lamiales.

Reconstructing phylogeny is a crucial target of contemporary biology, now commonly approached through computerized analysis of genetic sequence data. In angiosperms, despite recent progress at the ordinal level, many relationships between families remain unclear. Here we take a case study from Lamiales, an angiosperm order in which interfamilial relationships have so far proved particularly problematic. We examine the effect of changing one factor, the quantity of sequence data analyzed, on phylogeny reconstruction in this group. We use simulation to estimate a priori the quantity of sequence data that would be needed to resolve an accurate, supported phylogeny of Lamiales. We investigate the effects of increasing the length of the sequences analyzed, of increasing the rate of substitution in the sequences used, and of combining gene partitions. This approach could be a valuable technique for planning systematic investigations in other problematic groups. Our results suggest that increasing sequence length is a better way to improve support, resolution, and accuracy than employing sequences with a faster substitution rate. Indeed, the latter may in some cases have detrimental effects on phylogeny reconstruction. Further molecular sequencing (of at least 10,000 bp) should result in a fully resolved and supported phylogeny of Lamiales, but at present the problematic aspects of this tree model remain.
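
The general idea of estimating by simulation how much sequence is needed can be sketched in a few lines of Python. This is a toy illustration only, not the authors' procedure: the four-taxon tree, the Jukes-Cantor (JC69) substitution model, the branch lengths, and the pattern-counting parsimony criterion are all assumptions chosen for brevity, and every function name here is hypothetical. Sequences are evolved along a known tree with a short internal branch, the topology is re-inferred from them, and the fraction of replicates that recover the true tree is reported for each sequence length.

```python
import math
import random

BASES = "ACGT"


def evolve(seq, branch_length, rng):
    """Evolve a sequence along one branch under JC69 (assumed model)."""
    # JC69 probability that a site ends in a different base than it started.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return out


def simulate_quartet(length, t_int, t_ext, rng):
    """Simulate taxa A, B, C, D on the true tree ((A,B),(C,D))."""
    x = [rng.choice(BASES) for _ in range(length)]  # internal node joining A and B
    y = evolve(x, t_int, rng)                       # internal node joining C and D
    return (evolve(x, t_ext, rng), evolve(x, t_ext, rng),
            evolve(y, t_ext, rng), evolve(y, t_ext, rng))


def infer_topology(a, b, c, d):
    """Pick the quartet topology supported by the most informative sites."""
    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for w, x, y, z in zip(a, b, c, d):
        if w == x and y == z and w != y:
            counts["AB|CD"] += 1
        elif w == y and x == z and w != x:
            counts["AC|BD"] += 1
        elif w == z and x == y and w != x:
            counts["AD|BC"] += 1
    best = max(counts.values())
    winners = [k for k, v in counts.items() if v == best]
    # A tie is treated as an unresolved tree.
    return winners[0] if len(winners) == 1 else None


def accuracy(length, reps=30, t_int=0.01, t_ext=0.1, seed=1):
    """Fraction of replicates in which the true topology is recovered."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        seqs = simulate_quartet(length, t_int, t_ext, rng)
        if infer_topology(*seqs) == "AB|CD":
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for length in (100, 1000, 5000):
        print(length, accuracy(length))
```

Raising `t_ext` relative to `t_int` in this sketch mimics faster-evolving sequences on a tree with a short internal branch, so the same function can be used to compare the effect of substitution rate against the effect of sequence length.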
