CleaveLand: a pipeline for using degradome data to find cleaved small RNA targets

MicroRNAs (miRNAs) are endogenous RNA sequences, approximately 20 to 22 nt long, that play a critical role in the regulation of gene expression in eukaryotes. Confident identification of miRNA targets is vital to understanding their functions. Currently available computational algorithms for miRNA target prediction vary in sensitivity and specificity, and as a consequence each predicted target generally requires experimental confirmation. miRNAs and other small RNAs that direct endonucleolytic cleavage of target mRNAs produce diagnostic uncapped, polyadenylated mRNA fragments. Degradome sequencing [also known as PARE (parallel analysis of RNA ends) and GMUCT (genome-wide mapping of uncapped transcripts)] samples the 5'-ends of uncapped mRNAs and can be used to discover in vivo miRNA targets independently of computational predictions. Here, we describe a generalizable computational pipeline, CleaveLand, for the detection of cleaved miRNA targets from degradome data. CleaveLand takes as input degradome sequences, small RNAs and an mRNA database, and outputs small RNA targets. CleaveLand can thus be applied to degradome data from any species, provided that a set of mRNA transcripts and a set of query miRNAs or other small RNAs are available.

AVAILABILITY: The code and documentation for CleaveLand are freely available under a GNU license at http://www.bio.psu.edu/people/faculty/Axtell/AxtellLab/Software.html
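
The input and output flow described in the abstract (degradome sequences, small RNAs and an mRNA database in; small RNA targets out) can be illustrated with a minimal Python sketch. This is not CleaveLand's code or its algorithm. The function names are hypothetical, reads are mapped by exact string matching, and target sites are assumed to be perfectly complementary to the small RNA, which real targets need not be. The sketch also assumes the widely reported convention that cleavage occurs opposite positions 10 and 11 of the small RNA, so that the 5'-end of the downstream fragment pairs with small RNA position 10.

```python
from collections import Counter

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq):
    """Upper-case a sequence and convert DNA letters to RNA (T -> U)."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence; unknown letters become N."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(normalize(rna)))


def degradome_5p_ends(reads, transcripts):
    """Count, per transcript, the positions at which degradome read 5'-ends map.

    Assumption: exact matching only; a real pipeline would use an aligner.
    """
    ends = {name: Counter() for name in transcripts}
    for read in reads:
        read = normalize(read)
        for name, seq in transcripts.items():
            start = normalize(seq).find(read)
            while start != -1:
                ends[name][start] += 1
                start = normalize(seq).find(read, start + 1)
    return ends


def find_cleaved_targets(reads, small_rnas, transcripts):
    """Return (small_rna, transcript, cleavage_position, read_count) tuples.

    Assumption: a target site is a perfect reverse complement of the small RNA.
    A site is reported only if at least one degradome 5'-end maps exactly to
    the predicted cleavage position (0-based transcript coordinate).
    """
    ends = degradome_5p_ends(reads, transcripts)
    hits = []
    for srna_name, srna in small_rnas.items():
        site = reverse_complement(srna)
        for tx_name, tx in transcripts.items():
            tx = normalize(tx)
            site_start = tx.find(site)
            while site_start != -1:
                # The mRNA base paired with small RNA position 10 is the first
                # base of the 3' cleavage fragment sampled by the degradome.
                cleavage_pos = site_start + len(site) - 10
                count = ends[tx_name][cleavage_pos]
                if count > 0:
                    hits.append((srna_name, tx_name, cleavage_pos, count))
                site_start = tx.find(site, site_start + 1)
    return hits


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    srna = "UGACAGAAGAGAGUGAGCAC"
    tx = "AUGGCCAUUGCA" + reverse_complement(srna) + "GGAUCCGAUUACGCAUAA"
    cut = 12 + len(srna) - 10
    reads = [tx[cut:cut + 20], tx[cut:cut + 20], tx[:20]]
    print(find_cleaved_targets(reads, {"example_sRNA": srna}, {"tx1": tx}))
```

In this toy run, the two reads that begin at the predicted cleavage position support the site, while the read from the transcript's 5'-end does not. A realistic analysis would also need mismatch-tolerant alignment and some way of separating true cleavage signals from background degradation, neither of which this sketch attempts.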