Global estimation of the 3' untranslated region landscape using RNA sequencing.

The 3' untranslated region (3' UTR) of mRNA contains elements that play regulatory roles in polyadenylation, localization, translation efficiency, and mRNA stability. Despite the significance of the 3' UTR, no widely adopted method exists for annotating 3' UTRs and profiling their isoforms. Recently, poly(A)-position profiling by sequencing (3P-seq) and similar methods have been used successfully to annotate 3' UTRs; however, they involve complex RNA-biochemical experimental steps that result in a low product yield. In this paper, we propose heuristic and regression methods to estimate and quantify the usage of 3' UTRs from widely profiled RNA sequencing (RNA-seq) data. With this approach, the 3' UTR usage estimated from RNA-seq was found to be highly correlated with that estimated from 3P-seq, and poly(A) cleavage signals were detected upstream of the predicted poly(A) cleavage sites. Our methods predicted a greater number of 3' UTRs than 3P-seq, which allows the profiling of the 3' UTRs of most expressed genes in diverse cell types, stages, and species. Hence, the computational RNA-seq method for estimating the 3' UTR landscape should be useful as a tool for studying not only the functional roles of 3' UTRs but also gene regulation by 3' UTRs in a cell type-specific context. The method is implemented in open-source code, which is available at http://big.hanyang.ac.kr/GETUTR.
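To make the idea of estimating 3' UTR usage from RNA-seq coverage concrete, the following is a minimal sketch, not the published GETUTR implementation. It assumes that the regression step can be approximated by a least-squares non-increasing fit of per-base read coverage along a 3' UTR (computed with the pool-adjacent-violators algorithm), that positions where the fitted coverage drops mark candidate cleavage sites, and that the relative size of each drop approximates the usage of the isoform ending there. The function names `non_increasing_fit` and `estimate_usage` and the `min_drop` threshold are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def non_increasing_fit(coverage: List[float]) -> List[float]:
    """Least-squares non-increasing fit via pool-adjacent-violators.

    Assumption for this sketch: coverage along a 3' UTR should only
    decrease from 5' to 3', so upward bumps are treated as noise.
    """
    blocks: List[List[float]] = []  # each block stores [sum, count]
    for value in coverage:
        blocks.append([float(value), 1.0])
        # Merge while the previous block mean is lower than the current
        # one, since that violates the non-increasing constraint.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def estimate_usage(coverage: List[float],
                   min_drop: float = 1.0) -> List[Tuple[int, float]]:
    """Return (position, usage fraction) for candidate cleavage sites.

    A site is reported at the last position before the fitted coverage
    drops by at least `min_drop` (hypothetical threshold). Coverage past
    the end of the region is taken to be zero.
    """
    fitted = non_increasing_fit(coverage)
    sites: List[Tuple[int, float]] = []
    for pos, level in enumerate(fitted):
        next_level = fitted[pos + 1] if pos + 1 < len(fitted) else 0.0
        drop = level - next_level
        if drop >= min_drop:
            sites.append((pos, drop))
    total_drop = sum(drop for _, drop in sites)
    if total_drop <= 0:
        return []
    return [(pos, drop / total_drop) for pos, drop in sites]


if __name__ == "__main__":
    # Toy coverage: a short isoform ending at index 3 and a long one at index 7.
    toy_coverage = [8, 8, 8, 8, 2, 2, 2, 2]
    print(estimate_usage(toy_coverage))  # [(3, 0.75), (7, 0.25)]
```

In this toy example, three quarters of the coverage is lost after position 3, so the sketch assigns 75% usage to the proximal site and 25% to the distal one. Real coverage is much noisier, and the threshold and any normalization would need to be tuned against a reference such as 3P-seq.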
