The ancient Salicoid genome duplication event: A platform for reconstruction of de novo gene evolution in Populus trichocarpa

Orphan genes are genomic features with no detectable homology to genes in any other species, and they represent an important attribute of genome evolution as sources of novel genetic functions. Here, we identified 445 genes specific to Populus trichocarpa. Of these, we performed deeper reconstruction of 13 orphan genes to provide evidence of de novo gene evolution. Populus and its sister genus Salix are particularly well suited to the study of orphan gene evolution because the Salicoid whole-genome duplication (WGD) event resulted in highly syntenic sister chromosomal segments across the Salicaceae. We leveraged this genomic feature to reconstruct de novo gene evolution from intra-genomic, inter-species, and inter-genera perspectives by comparing the syntenic regions first within the P. trichocarpa reference, then in P. deltoides, and finally in Salix purpurea. Furthermore, we demonstrated that 86.5% of the putative orphan genes had evidence of transcription. We also used the Populus genome-wide association mapping panel (GWAS), a collection of 1,084 undomesticated P. trichocarpa genotypes, to further determine putative regulatory networks of orphan genes using expression quantitative trait loci (eQTL) mapping. Functional enrichment of these eQTL subnetworks identified common biological themes associated with orphan genes, such as response to stress and defense response. We also identified a putative cis-element for a de novo gene and leveraged conserved synteny to describe the evolution of a putative transcription factor binding site. Overall, 45% of orphan genes were captured in trans-eQTL networks.
