Detection of novel 3' untranslated region extensions with 3' expression microarrays

Background
The 3' untranslated regions (UTRs) of transcripts are not well characterized for many genes and often extend beyond the annotated regions. Since Affymetrix 3' expression arrays were designed based on expressed sequence tags, many probesets map to intergenic regions downstream of genes. We used expression information from these probesets to predict transcript extension beyond currently known boundaries.

Results
Based on our dataset encompassing expression in 22 different murine tissues, we identified 845 genes with predicted 3' UTR extensions. These extensions show conservation similar to that of known 3' UTRs, which is distinctly higher than that of intergenic regions. We verified 8 of the predictions by PCR and found all of the tested regions to be expressed. The method can be extended to other 3' expression microarray platforms, as we demonstrate with human data. Additional confirming evidence was obtained from public paired-end read data.

Conclusions
We show that many genes have 3' UTRs extending beyond currently known gene regions and provide a method to identify such regions based on microarray expression data. Since 3' UTRs contain microRNA binding sites and other stability-determining regions, identification of the full-length 3' UTR is important to elucidate posttranscriptional regulation.
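The core idea, using probesets that map just downstream of an annotated gene end as evidence for a longer 3' UTR, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Gene` and `Probeset` records, the `candidate_extensions` function, and the `max_distance`, `min_expr` and `min_tissues` thresholds are all hypothetical names and values chosen for illustration, and the abstract does not state which criteria were actually applied.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # leftmost annotated coordinate
    end: int     # rightmost annotated coordinate


@dataclass
class Probeset:
    probe_id: str
    chrom: str
    strand: str
    start: int
    end: int
    expression: Sequence[float]  # one value per tissue (e.g. log2 scale)


def downstream_distance(gene: Gene, ps: Probeset) -> Optional[int]:
    """Distance from the annotated 3' end to the probeset.

    Returns None if the probeset is on another chromosome or strand,
    or if it does not lie strictly downstream of the gene.
    """
    if gene.chrom != ps.chrom or gene.strand != ps.strand:
        return None
    if gene.strand == "+":
        d = ps.start - gene.end    # 3' end is the rightmost coordinate
    else:
        d = gene.start - ps.end    # 3' end is the leftmost coordinate
    return d if d > 0 else None


def candidate_extensions(
    gene: Gene,
    probesets: Sequence[Probeset],
    max_distance: int = 5000,   # hypothetical window size
    min_expr: float = 7.0,      # hypothetical expression cutoff
    min_tissues: int = 1,       # hypothetical number of expressing tissues
) -> List[Tuple[str, int]]:
    """List (probe_id, distance) for expressed downstream probesets."""
    hits = []
    for ps in probesets:
        d = downstream_distance(gene, ps)
        if d is None or d > max_distance:
            continue
        n_expressed = sum(1 for v in ps.expression if v >= min_expr)
        if n_expressed >= min_tissues:
            hits.append((ps.probe_id, d))
    return sorted(hits, key=lambda h: h[1])
```

A real analysis would additionally need to exclude probesets that overlap another annotated gene and to compare the expression pattern of the downstream probeset with that of the gene itself; both steps are omitted here for brevity.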
