Origin and Spread of de Novo Genes in Drosophila melanogaster Populations

Novel genes derived from ancestral noncoding sequences are polymorphic among fruit fly strains.

Comparative genomic analyses have revealed that genes may arise from ancestrally nongenic sequence. However, the origin and spread of these de novo genes within populations remain obscure. We identified 142 segregating and 106 fixed testis-expressed de novo genes in a population sample of Drosophila melanogaster. These genes appear to derive primarily from ancestral intergenic, unexpressed open reading frames, with natural selection playing a significant role in their spread. These results reveal a heretofore unappreciated dynamism of gene content.

Losses and Gains

In order to better understand the process by which de novo genes originate, Zhao et al. (p. 769, published online 23 January) examined testis-based gene expression among Drosophila melanogaster strains and identified both fixed and polymorphic de novo genes. The results suggest that spontaneous activation of previously noncoding DNA may be an important factor in generating genetic novelty.
