Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies

Heliconius butterflies, a speciose genus of Müllerian mimics, represent a classic example of an adaptive radiation involving a range of derived dietary, life history, physiological and neural traits. However, key lineages within the genus, and across the broader Heliconiini tribe, lack genomic resources, limiting our understanding of how adaptive and neutral processes shaped genome evolution across their radiation. Here, we build new, highly contiguous genome assemblies for nine Heliconiini species, generate reference-assembled genomes for 29 species, and improve 10 existing assemblies, providing a major new dataset of annotated genomes for 63 species, including 58 species within the Heliconiini tribe. We provide a robust, dated heliconiine phylogeny, identify major patterns of introgression, and explore, for the first time, the evolution of genome size and content and the genomic basis of key innovations in this enigmatic group. We illustrate how dense genomic sampling improves our resolution of gene-phenotype links and our understanding of how genomes evolve.

Teaser: Dense sampling reveals the genomic basis of key innovations in an enigmatic tribe of butterflies.
