Gene Gain and Loss from the Asian Corn Borer W Chromosome

We built a chromosome-level genome assembly of the Asian corn borer, Ostrinia furnacalis Guenée (Lepidoptera: Pyralidae, Pyraloidea), an economically important pest of corn, from a single female, so the assembly includes both the Z and W chromosomes. Despite deep conservation of the Z chromosome across Lepidoptera, our chromosome-level W assembly shows little conservation with either the available W chromosome sequences from related species or the Z chromosome, consistent with a non-canonical origin of the W chromosome. The W chromosome has accumulated abundant repetitive elements and has rapidly gained genes from the rest of the genome, most of which show signs of pseudogenization after duplication to the W. The genes that retain significant expression are largely enriched for functions in DNA recombination, the nucleosome, chromatin, and DNA binding, which are likely related to meiotic and mitotic processes within the female gonad.
