Improved transcriptome sampling pinpoints 26 paleopolyploidy events in Caryophyllales, including two paleo-allopolyploidy events

Studies of the macroevolutionary legacy of paleopolyploidy are limited by incomplete sampling of these events across the tree of life. To better locate and understand these events, we need comprehensive taxonomic sampling as well as homology inference methods that accurately reconstruct the frequency and location of gene duplications. We assembled a dataset of transcriptomes and genomes from 169 species in Caryophyllales, of which 43 were newly generated for this study, making it one of the most densely sampled genomic-scale datasets yet available. We carried out phylogenomic analyses using a modified phylome strategy to reconstruct the species tree. We mapped the phylogenetic distribution of paleopolyploidy events using both tree-based and distance-based methods, and explicitly tested scenarios for paleo-allopolyploidy. We identified twenty-six paleopolyploidy events distributed throughout Caryophyllales and, using novel techniques, inferred two of them to be paleo-allopolyploidy events. Through dense phylogenomic sampling, we show how frequently paleopolyploidy has occurred in Caryophyllales. We also provide the first method for using transcriptome data to detect paleo-allopolyploidy, which is important because paleo-allopolyploidy may have different macroevolutionary implications than paleo-autopolyploidy.
