Antisense Transcription in the Mammalian Transcriptome
RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group) and the FANTOM Consortium

Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring “genes” in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
The sense strand of DNA generally provides the template for production of mRNA, which in turn encodes proteins. Transcription from the opposite (antisense) strand can produce transcripts that hybridize with the coding DNA strand, or with the antisense transcript, to interfere with transcription or mRNA stability.
Although previous analysis of the mammalian transcriptome suggested that up to 20% of transcripts may contribute to sense-antisense (S/AS) pairs (1–3), large-scale cDNA sequencing in the FANTOM3 project (4) suggests that antisense transcription is more widespread. To elucidate the function of S/AS pairs, we used the FANTOM3 data set to analyze their location in the mouse genome, the extent and position of their overlap, and promoter architecture and regulation (4).
Analysis of the imprinted Gnas locus in mice demonstrated numerous sense and antisense transcripts expressed selectively depending on parental chromosomal origin (5). However, paired S/AS expression is not restricted to imprinted loci. For example, fig. S1 shows the complex transcript overlap patterns of the HoxA locus.
To analyze such complex loci on a genomewide scale, we generated a cDNA set comprising 158,807 full-length transcripts obtained by merging the FANTOM3 set of 102,801 cDNAs (http://fantom3.gsc.riken.jp/db/) with mouse cDNAs from GenBank (www.ncbi.nlm.nih.gov/Genbank/) and clustering the cDNAs into transcriptional units (TUs), in which members share sequence transcribed from the same strand. There were 50,111 overlapping transcript pairs, grouped into 29,780 nonredundant overlapping regions in 8331 TU pairs (9713 distinct representative overlapping regions).
In the accompanying paper (4), transcription and termination sites were identified. On the basis of this information, more than 72% of all genome-mapped TUs (43,553) overlap with some cDNA, 5′ or 3′ expressed sequence tag (EST) sequence, or tag or tag-pair region mapped to the opposite strand (Table 1). From the above data, 4520 TU pairs contain full-length transcripts that form S/AS pairs on exons (Table 2). S/AS interaction might also occur between immature RNAs (heterogeneous nuclear RNA, hnRNA) in the nucleus. Furthermore, introns themselves can give rise to smaller RNAs with biological activity (6). In addition to transcript pairs that share exons in opposite orientations, 4129 TU pairs were transcribed from different strands of the same locus without apparently sharing overlapping exons (Table 2). Although conservative, the combined S/AS prediction is 1.5- to 2-fold greater than that from previous studies of mouse (1) and human (2, 3, 7) transcripts.
Overlaps of cis S/AS pairs can target different portions of the corresponding TU, giving rise to three basic types of S/AS pairs (fig. S2): head-to-head or divergent (D), tail-to-tail or convergent (C), and fully overlapping (F). The relative abundance of these classes is shown in Table 3. The divergent (head-to-head) classes are the most prevalent, in contrast to previous studies emphasizing convergent cis S/AS pairs (3′-3′ end) (2, 8, 9).
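As an illustration only, the three overlap classes can be expressed as a small function over genomic intervals. The sketch below is not the FANTOM3 pipeline. The `TU` record, the coordinates, and the rule that class F means one TU span lies entirely within the other are assumptions made for this example, and the code compares whole TU spans rather than exons.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TU:
    """Hypothetical transcriptional unit: a genomic span plus a strand."""
    name: str
    start: int   # leftmost genomic coordinate (inclusive)
    end: int     # rightmost genomic coordinate (inclusive)
    strand: str  # "+" or "-"


def classify_pair(a: TU, b: TU) -> Optional[str]:
    """Return "D", "C", or "F" for a cis S/AS pair, or None otherwise.

    D = head-to-head (divergent): the 5' ends overlap.
    C = tail-to-tail (convergent): the 3' ends overlap.
    F = fully overlapping: assumed here to mean that one span
        contains the other.
    """
    # An S/AS pair requires opposite strands.
    if a.strand == b.strand:
        return None
    # The spans must share at least one base.
    if a.start > b.end or b.start > a.end:
        return None
    # Containment in either direction counts as fully overlapping.
    a_contains_b = a.start <= b.start and b.end <= a.end
    b_contains_a = b.start <= a.start and a.end <= b.end
    if a_contains_b or b_contains_a:
        return "F"
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    # Partial overlap. If the plus-strand TU lies to the left, the shared
    # region holds its 3' end and the 3' end of the minus-strand TU
    # (convergent). Otherwise the shared region holds both 5' ends
    # (divergent).
    return "C" if plus.start < minus.start else "D"


if __name__ == "__main__":
    sense = TU("sense", 400, 900, "+")
    antisense = TU("antisense", 100, 500, "-")
    print(classify_pair(sense, antisense))
```

In this toy case the minus-strand TU lies to the left of the plus-strand TU, so the shared region contains both 5′ ends and the pair is reported as divergent. An exon-aware version would apply the same test to exon intervals instead of whole spans.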
As an example of the divergent arrangement, the insulin-like growth factor 1 receptor (IGF1R) shows a very strong antisense CAGE tag overlapping the promoter of the sense transcript, which provides a parallel to the AIR noncoding RNA (ncRNA) in the IGF2R locus (10).
S/AS phenomena affect different types of genes (tables S1 and S2) and are unevenly distributed across the genome (table S3). Mouse chromosomes 4 and 17 show an S/AS pair density that is greater than average, whereas chromosomes 6, 9, and 13 show an S/AS pair density that is significantly lower than average (table S3). Chromosome 6 is largely homologous to human chromosome 7, which is known to be rich in RNAs transcribed by RNA polymerases I and III, a facet not captured by our approach (11). The X chromo-