Compositae-ParaLoss-1272: Complementary sunflower specific probe-set reduces issues with paralogs in complex systems

Premise: The sunflower family-specific probe set, Compositae-1061, has enabled family-wide phylogenomic studies and investigations at lower taxonomic levels by targeting 1,000+ genes. However, it generally lacks resolution at the genus-to-species level, especially in groups with complex evolutionary histories that include polyploidy and hybridization.

Methods: In this study, we developed a new Hyb-Seq probe set, Compositae-ParaLoss-1272, designed to target orthologous loci in members of the Asteraceae. We tested its efficiency across the family by simulating target-enrichment sequencing in silico. We also tested its effectiveness at lower taxonomic levels in the genus Packera, which has a complex evolutionary and taxonomic history. We performed Hyb-Seq with Compositae-ParaLoss-1272 on 19 taxa that had previously been studied using the Compositae-1061 probe set. Sequences from both probe sets were used to generate phylogenies, compare topologies, and assess node support.

Results: Compositae-ParaLoss-1272 captured loci across all tested Asteraceae members. Compared with Compositae-1061, it also showed less gene tree discordance, recovered considerably fewer paralogous sequences, and retained longer loci.

Discussion: Given the complexity of plant evolutionary histories, assigning orthology for phylogenomic analyses will continue to be challenging. However, we anticipate that this new probe set will provide improved resolution and utility for studies at lower taxonomic levels and in complex groups within the sunflower family.
