Random sequences are an abundant source of bioactive RNAs or peptides

It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise from random non-coding DNA has so far been considered negligible, because it seems unlikely that such an RNA or protein sequence would have an initial function that influences the fitness of an organism. Here, we address this question systematically by expressing clones with random sequences in Escherichia coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments, up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing individual clones in competition assays confirms their activity and indicates that it could be exerted by either the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.
