Intron size, abundance, and distribution within untranslated regions of genes.

Most research on intron evolution has considered introns within coding sequences (CDSs) and has largely disregarded introns located within the untranslated regions (UTRs) of genes. Here, we directly determined intron size, abundance, and distribution in the UTRs of genes using full-length cDNA libraries and complete genome sequences for four species: Arabidopsis thaliana, Drosophila melanogaster, human, and mouse. Overall intron occupancy (introns per kbp of exon) is lower in 5' UTRs than in CDSs, but intron density (intron occupancy in regions containing introns) tends to be higher in 5' UTRs than in CDSs. Introns in 5' UTRs are roughly twice as large as introns in CDSs, and there is a sharp drop in intron size at the 5' UTR-CDS boundary. We propose a mechanistic explanation for the existence of selection for larger intron size in 5' UTRs, and outline several implications of this hypothesis. We found introns to be randomly distributed within 5' UTRs, provided that a minimum required exon size was assumed. Introns were much less abundant in 3' UTRs than in 5' UTRs. Although this was expected for human and mouse, which have intron-dependent nonsense-mediated decay (NMD) pathways that discourage the presence of introns within the 3' UTR, it was also true for A. thaliana and D. melanogaster, which may lack intron-dependent NMD. Our findings have several implications for theories of intron evolution and of genome evolution in general.
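
The two abundance measures defined above can be illustrated with a minimal sketch. It assumes each region (for example, one gene's 5' UTR) is summarized by its total exonic length and its intron count. The `Region` class, the function names, and the toy numbers are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Region:
    """Hypothetical summary of one region (e.g. a 5' UTR or a CDS)."""
    exon_bp: int    # total exonic length of the region, in bp
    n_introns: int  # number of introns interrupting the region


def intron_occupancy(regions: Iterable[Region]) -> float:
    """Introns per kbp of exon sequence, pooled over all regions."""
    regions = list(regions)
    total_bp = sum(r.exon_bp for r in regions)
    total_introns = sum(r.n_introns for r in regions)
    if total_bp == 0:
        raise ValueError("no exon sequence to normalize by")
    return 1000.0 * total_introns / total_bp


def intron_density(regions: Iterable[Region]) -> float:
    """Occupancy restricted to regions that contain at least one intron."""
    return intron_occupancy(r for r in regions if r.n_introns > 0)


# Toy data (assumed, for illustration only): three 5' UTRs, one with an intron.
toy_utrs = [Region(200, 1), Region(300, 0), Region(500, 0)]
occupancy = intron_occupancy(toy_utrs)  # 1 intron over 1000 bp of exon
density = intron_density(toy_utrs)      # 1 intron over 200 bp of exon
```

With these toy values, occupancy is 1 intron per kbp while density is 5 introns per kbp. This shows how a region class in which introns are rare can still have a high density wherever introns do occur, which is the pattern reported for 5' UTRs relative to CDSs.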
