Genome-wide functional analysis of human 5' untranslated region introns

Background: Approximately 35% of human genes contain introns within the 5' untranslated region (UTR). Introns in 5'UTRs differ from those in coding regions and 3'UTRs with respect to nucleotide composition, length distribution and density. Despite their presumed impact on gene regulation, the evolution and possible functions of 5'UTR introns remain largely unexplored.

Results: We performed a genome-scale computational analysis of 5'UTR introns in humans. We discovered that the most highly expressed genes tended to have short 5'UTR introns, rather than long 5'UTR introns or no 5'UTR introns at all. We found no correlation between 5'UTR intron presence or length and variance in expression across tissues, a correlation that might have indicated a broad role in the regulation of expression. We did, however, observe an uneven distribution of 5'UTR introns amongst genes in specific functional categories. In particular, genes with regulatory roles were surprisingly enriched for 5'UTR introns. Finally, we analyzed the evolution of 5'UTR introns in non-receptor protein tyrosine kinases (NRTKs), and identified a conserved DNA motif enriched within the 5'UTR introns of human NRTKs.

Conclusions: Our results suggest that human 5'UTR introns enhance the expression of some genes in a length-dependent manner. While many 5'UTR introns are likely to be evolving neutrally, their relationship with gene expression and their overrepresentation among regulatory genes, taken together, suggest that complex evolutionary forces are acting on this distinct class of introns.
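
The enrichment of 5'UTR introns within a functional category can be illustrated with a one-sided hypergeometric test. The sketch below is only an illustration of that kind of test, not the procedure used in the study; the function name, its parameters and the toy counts are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_with_intron,
                           category_size, category_with_intron):
    """One-sided hypergeometric p-value, P(X >= category_with_intron).

    X is the number of 5'UTR-intron-containing genes in a random sample
    of `category_size` genes drawn without replacement from a genome of
    `total_genes` genes, of which `genes_with_intron` carry a 5'UTR intron.
    All names here are hypothetical, chosen for illustration only.
    """
    upper = min(category_size, genes_with_intron)
    denominator = comb(total_genes, category_size)
    # Sum the probability mass of every outcome at least as extreme
    # as the observed count.
    tail = sum(
        comb(genes_with_intron, k)
        * comb(total_genes - genes_with_intron, category_size - k)
        for k in range(category_with_intron, upper + 1)
    )
    return tail / denominator


# Toy counts (assumed, not taken from the study): 20 genes, 7 of them
# with a 5'UTR intron (35%), and a category of 5 genes of which 4 have one.
p_value = hypergeom_enrichment_p(20, 7, 5, 4)
print(f"enrichment p-value: {p_value:.4f}")
```

In a genome-wide setting, one such test would be run per functional category, and the resulting p-values would need correction for multiple testing.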
