Evidence for transcript networks composed of chimeric RNAs in human cells

The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in organisms from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches, we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein-coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.

Citation: Djebali S, Lagarde J, Kapranov P, Lacroix V, Borel C, et al. (2012) Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells. PLoS ONE 7(1): e28213. doi:10.1371/journal.pone.0028213

Editor: Thomas Preiss, The John Curtin School of Medical Research, Australia

Received July 6, 2011; Accepted November 3, 2011; Published January 4, 2012

Copyright: © 2012 Djebali et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work has been supported by grants U01HG003150 and U01HG003147 from the National Human Genome Research Institute for the ENCODE project (http://www.genome.gov/10005107) to RG (CRG, Lausanne and Geneva) and TG (Affymetrix and Cold Spring Harbor). RG was also supported by a grant from the Spanish Ministry of Education and Science (http://www.educacion.gob.es/portada.html). In addition, KS-A, RM, XY, CL, LG and MV were supported by a grant from the Ellison Foundation (http://www.ellisonfoundation.org/index.jsp) (to MV) and as Institute Sponsored Research from the Dana Farber Cancer Institute Strategic Initiative. They wish to thank Changyu Fan and Yun Shen for providing informatics support. The laboratories of SA and AR were supported by grants from the Swiss National Science Foundation (http://www.snf.ch/E/Pages/default.aspx) and the European Commission AnEUploidy Integrated Project (http://www.tigem.it/scientific-office/eu-funded-projects-1/aneuploidy). SA was also supported by the National Center of Excellence “Frontiers in Genetics” (http://www.frontiers-ingenetics.org/page.php?id=profile_en), the ChildCare Foundation (http://www.projectchildcarefoundation.org/) and an ERC grant from the European Union (http://erc.europa.eu/). AF, TH and JM acknowledge support from the Wellcome Trust (http://www.wellcome.ac.uk/). The work of JLG and MO has been supported by the Spanish Ministry of Science (CTQ2005-09365-C02-02, BIO2009-10964), the Instituto Nacional de Bioinformàtica (http://www.inab.org/), the Consolider E-science project (CSD2007-00050), COMBIOMED RETICS and the Fundación Marcelino Botin (http://www.fundacionmbotin.org/). JD is supported by a grant from the National Institutes of Health (HG003143) and a W. M. Keck Foundation Distinguished Young Scholar Award. JS is supported by a grant from the National Institutes of Health (U54 HG004592).
The work of MT and AV was supported by Consolider E-Science (CSD2007-00050) and the Instituto Nacional de Bioinformàtica (http://www.inab.org/). MV (Center for Cancer Systems Biology, CCSB) is a “Chercheur Qualifié Honoraire” from the Fonds de la Recherche Scientifique (FRS-FNRS, French Community of Belgium). This work has also been funded by a grant to TG from NHGRI (U54HG004557) (http://www.genome.gov/10005107) and partially by Affymetrix, Corp (http://www.affymetrix.com/estore/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Affymetrix manufactured the arrays used and these were purchased at the sale price agreed to in the reported grant.

Competing Interests: The study was partially funded by Affymetrix Inc. PK, SF, IB, ED, J.Drenkow, AD and TG were employees of Affymetrix Inc. at the time of the study. There are no patents, products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLoS ONE policies on sharing data and materials, as detailed online in the guide for authors.

* E-mail: gingeras@cshl.edu (TG); roderic.guigo@crg.cat (RG); Stylianos.Antonarakis@unige.ch (SA)

These authors contributed equally to this work.

¤a Current address: Helicos BioSciences Corporation, Cambridge, Massachusetts, United States of America
¤b Current address: Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, INRIA BAMBOO, Villeurbanne, France
¤c Current address: Integromics, S.L., Grisolía, Tres Cantos, Madrid, Spain
