A computational approach for identifying microRNA-target interactions using high-throughput CLIP and PAR-CLIP sequencing

Background: MicroRNAs (miRNAs) play a critical role in down-regulating gene expression. By coupling with Argonaute family proteins, miRNAs bind to target sites on mRNAs and induce translational repression. A large number of miRNA-target interactions (MTIs) have been identified by crosslinking and immunoprecipitation (CLIP) and photoactivatable-ribonucleoside-enhanced CLIP (PAR-CLIP) combined with next-generation sequencing (NGS). PAR-CLIP achieves highly efficient RNA co-immunoprecipitation, but it also leads to T-to-C conversions in miRNA-RNA-protein crosslinking regions. This experimentally induced error reduces the mappability of reads. However, a dedicated tool for analyzing CLIP and PAR-CLIP data that takes T-to-C conversion into account is still lacking.

Results: We propose miRTarCLIP, the first CLIP and PAR-CLIP sequencing analysis platform designed specifically for miRNA target analysis. Starting from raw reads, it automatically removes adaptor sequences, filters low-quality reads, reverts C to T, aligns reads to 3'UTRs, scans for read clusters, identifies high-confidence miRNA target sites, and provides annotations from external databases. With multi-threading and our novel C-to-T reversion procedure, miRTarCLIP greatly reduces running time compared with conventional approaches. In addition, miRTarCLIP provides a web-based interface for browsing and searching the targets of miRNAs of interest. To demonstrate the functionality of miRTarCLIP, we applied it to two publicly available CLIP and PAR-CLIP sequencing datasets. miRTarCLIP not only produces results comparable to those of existing tools at a much faster speed, but also reveals interesting features among the putative target sites. Specifically, miRTarCLIP shows that T-to-C conversions within positions 1-7 and within positions 8-14 of miRNA target sites differ significantly (p value = 0.02), and the difference is even more significant when only sites targeted by the top 102 highly expressed miRNAs are considered (p value = 0.01). These results agree with previous findings and further suggest that combining miRNA expression and PAR-CLIP data can improve the accuracy of miRNA target prediction.

Conclusion: In summary, we devised a systematic approach for mining miRNA target sites from CLIP-seq and PAR-CLIP sequencing data and integrated the workflow with a graphical web-based browser, which provides a user-friendly interface and detailed annotations of MTIs. We also showed through real-life examples that miRTarCLIP is a powerful tool for understanding miRNAs. The integrated tool is freely accessible online at http://miRTarCLIP.mbc.nctu.edu.tw.
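
The abstract does not describe how the C to T reversion is implemented. The following is only a minimal sketch of one way such a step could work, under the assumption that both the read and the reference are collapsed to a three-letter alphabet (every C replaced by T) before matching, and that candidate placements are then checked against the original sequences so that only T (reference) to C (read) differences are accepted. The function names `collapse_c_to_t` and `find_conversion_aware_hits` are hypothetical and are not taken from miRTarCLIP, and exact string search stands in for a real short-read aligner.

```python
from typing import List, Tuple


def collapse_c_to_t(seq: str) -> str:
    """Replace every C with T so that T-to-C conversions no longer cause mismatches."""
    return seq.upper().replace("C", "T")


def find_conversion_aware_hits(read: str, reference: str) -> List[Tuple[int, List[int]]]:
    """Return (start, conversion_offsets) for each placement of `read` on `reference`
    that is exact apart from T (reference) -> C (read) substitutions.

    Illustrative assumption: exact matching in the collapsed alphabet,
    not the procedure used by miRTarCLIP.
    """
    read = read.upper()
    reference = reference.upper()
    collapsed_read = collapse_c_to_t(read)
    collapsed_ref = collapse_c_to_t(reference)

    hits = []
    start = collapsed_ref.find(collapsed_read)
    while start != -1:
        window = reference[start:start + len(read)]
        pairs = list(zip(window, read))
        # A collapsed match only guarantees that each position is identical or a T/C pair.
        # Keep placements whose differences are all reference T -> read C;
        # reject the opposite direction (reference C -> read T).
        if all(r == q or (r == "T" and q == "C") for r, q in pairs):
            conversions = [i for i, (r, q) in enumerate(pairs) if r == "T" and q == "C"]
            hits.append((start, conversions))
        start = collapsed_ref.find(collapsed_read, start + 1)
    return hits


if __name__ == "__main__":
    # Toy reference and a read carrying one T-to-C conversion at read offset 2.
    print(find_conversion_aware_hits("ATCACA", "GGATTACAGG"))
```

In this toy case the read is placed at reference position 2 and the converted base is reported at read offset 2. A read that differs in the opposite direction (T in the read where the reference has C) is rejected even though it matches in the collapsed alphabet.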
