microPIECE - microRNA pipeline enhanced by CLIP experiments

MicroRNAs (miRNAs) are regarded as post-transcriptional fine-regulators. With a length of around 21 nucleotides, they form an RNA-induced silencing complex (RISC) with a protein of the Argonaute family. This complex binds to the untranslated regions and coding sequences of messenger RNAs (mRNAs) and, in general, promotes their degradation or inhibits their translation. Knowing the miRNA-mRNA pairs is therefore essential for inferring the effects of dysregulation on the organism. Various tools with different technical approaches have been developed to assign a miRNA to its mRNA target. Most are based on the assumption that the first eight nucleotides of the miRNA (the seed region) determine the binding region on the mRNA. Some approaches also include supporting base pairing in the rear (3') part of the miRNA; others take secondary structures of the mRNA or binding energies of the mRNA-miRNA complex into account. Nevertheless, they all suffer from the statistical problem that such short target regions often occur in transcript sequences simply by chance. This results in a large number of false positive predictions. For example, a target prediction of all 590 Tribolium castaneum mature microRNAs from miRBase.org v22 (Kozomara and Griffiths-Jones 2013) against all 18,534 protein-coding cDNA sequences from Ensembl.org (Ensembl Genomes release 38, December 2017) (Kinsella et al. 2011) yields 2,948,255 possible microRNA-target interactions when run with the commonly used tool miRanda (Betel et al. 2008) and standard parameters. To increase credibility, wet lab validation methods such as luciferase reporter assays are required. Their disadvantage is that they are not applicable to high-throughput analysis, as they can only handle small subsets of sequence combinations.
A more scalable method is cross-linking immunoprecipitation followed by high-throughput sequencing (CLIP-seq). Here, binding regions of the RISC show a specific signal in the sequencing reads; when the reads are mapped to the transcriptome, this signal can be used to shrink the search space of miRNA target predictions. The limitation is the technically demanding laboratory procedure, which is why only a few datasets are available, namely for human, mouse, worm and mosquito. It would therefore be useful to transfer the information about a binding region already identified by CLIP-seq to another species. This is the purpose of our microRNA pipeline enhanced by CLIP experiments (microPIECE).
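The seed-based reasoning above can be illustrated with a minimal Python sketch. It is not the microPIECE implementation or the miRanda algorithm. It only shows (a) how a seed-complementary motif can be located in a transcript, (b) how often such a short motif would be expected by chance under the simplifying assumption of uniformly random, independent nucleotides, and (c) how known binding regions could restrict the candidate sites. All function names, the toy sequences and the half-open region coordinates are assumptions made for illustration.

```python
from typing import List, Tuple

# Watson-Crick complement for RNA (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, seed_len: int = 8) -> str:
    """Return the mRNA motif (5'->3') that is complementary to the miRNA seed.

    seed_len=8 mirrors the "first eight nucleotides" assumption in the text.
    """
    seed = mirna.upper().replace("T", "U")[:seed_len]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_sites(mirna: str, transcript: str, seed_len: int = 8) -> List[int]:
    """Return 0-based start positions of all (overlapping) seed sites."""
    motif = seed_site(mirna, seed_len)
    rna = transcript.upper().replace("T", "U")
    hits = []
    start = rna.find(motif)
    while start != -1:
        hits.append(start)
        start = rna.find(motif, start + 1)
    return hits


def expected_chance_hits(transcript_len: int, seed_len: int = 8) -> float:
    """Expected number of exact motif matches in a uniformly random sequence."""
    windows = max(transcript_len - seed_len + 1, 0)
    return windows / 4 ** seed_len


def within_regions(hits: List[int], seed_len: int,
                   regions: List[Tuple[int, int]]) -> List[int]:
    """Keep only sites that lie fully inside a half-open region [start, end)."""
    return [h for h in hits
            if any(s <= h and h + seed_len <= e for s, e in regions)]


if __name__ == "__main__":
    toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"        # toy miRNA sequence
    toy_transcript = "GGGCUACCUCAAAAACUACCUCACC"  # made-up transcript
    toy_regions = [(0, 12)]                      # hypothetical binding region

    sites = find_sites(toy_mirna, toy_transcript)
    print("seed sites:", sites)
    print("supported by region:", within_regions(sites, 8, toy_regions))
    print("expected chance hits in 2,000 nt:", expected_chance_hits(2000))
```

For scale, the figures quoted above correspond to 590 × 18,534 = 10,935,060 miRNA-transcript pairs, so 2,948,255 predicted interactions amount to roughly 159 predictions per transcript on average. Restricting candidate sites to experimentally supported regions, as in `within_regions`, is the kind of search-space reduction that CLIP-seq data makes possible.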

[1] Syed Haider, et al. Ensembl BioMarts: a hub for data retrieval across taxonomic space, 2011, Database J. Biol. Databases Curation.

[2] Aaron R. Quinlan, et al. BIOINFORMATICS APPLICATIONS NOTE, 2022.

[3] Fedor V. Karginov, et al. Transcriptome-wide microRNA and target dynamics in the fat body during the gonadotrophic cycle of Aedes aegypti, 2017, Proceedings of the National Academy of Sciences.

[4] Donald C. Chang, et al. The MicroRNA, 2018, Methods in molecular biology.

[5] Marcel Martin. Cutadapt removes adapter sequences from high-throughput sequencing reads, 2011.

[6] E. Myers, et al. Basic local alignment search tool, 1990, Journal of molecular biology.

[7] Richard Durbin, et al. Fast and accurate short read alignment with Burrows-Wheeler transform, 2009.

[8] microPIECE (microRNA pipeline enhanced by CLIP experiments), 2018.

[9] Gonçalo R. Abecasis, et al. The Sequence Alignment/Map format and SAMtools, 2009, Bioinform.

[10] Sebastian D. Mackowiak, et al. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades, 2011, Nucleic acids research.

[11] Doron Betel, et al. The microRNA.org resource: targets and expression, 2007, Nucleic Acids Res.

[12] Ana Kozomara, et al. miRBase: annotating high confidence microRNAs using deep sequencing data, 2013, Nucleic Acids Res.

[13] Serban Nacu, et al. Fast and SNP-tolerant detection of complex variants and splicing in short reads, 2010, Bioinform.

[14] Sonja J. Prohaska, et al. Proteinortho: Detection of (Co-)orthologs in large-scale analysis, 2011, BMC Bioinformatics.

[15] I. Longden, et al. EMBOSS: the European Molecular Biology Open Software Suite, 2000, Trends in genetics : TIG.

[16] Daniel Amsel, et al. Evaluation of high-throughput isomiR identification tools: illuminating the early isomiRome of Tribolium castaneum, 2017, BMC Bioinformatics.

[17] Xavier Estivill, et al. SeqBuster, a bioinformatic tool for the processing and analysis of small RNAs datasets, reveals ubiquitous miRNA modifications in human embryonic cells, 2009, Nucleic acids research.

[18] Andrew D. Smith, et al. Site identification in high-throughput RNA-protein interaction data, 2012, Bioinform.