Accurate computational prediction of the transcribed strand of CRISPR non-coding RNAs

MOTIVATION: CRISPR RNAs (crRNAs) are small non-coding RNAs that form a key part of an acquired immune system in prokaryotes. Specific prediction methods find crRNA-encoding loci in nearly half of sequenced bacterial species and three quarters of sequenced archaeal species. These loci, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) arrays, consist of repeat elements alternating with specific spacers. Generally one strand is transcribed, producing long pre-crRNAs that are processed into short crRNAs, which base pair with invading nucleic acids to facilitate their destruction. No current software for the discovery of CRISPR loci predicts the direction of crRNA transcription.

RESULTS: We have developed an algorithm, CRISPRDirection, that accurately predicts the strand of the resulting crRNAs. The method takes CRISPR repeat predictions as input. It calculates parameters from the repeat predictions and their flanking sequences, then combines them by weighted voting. The prediction may use prior coding sequence annotation, but this is not required. CRISPRDirection correctly predicted the orientation of 94% of a reference set of arrays.

AVAILABILITY AND IMPLEMENTATION: The Perl source code is freely available from http://bioanalysis.otago.ac.nz/CRISPRDirection.
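The weighted-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CRISPRDirection implementation (which is written in Perl); the feature names, the weights, and the function `predict_strand` are hypothetical and chosen only for illustration. Each feature is assumed to cast a vote of +1 (forward strand), -1 (reverse strand), or 0 (no evidence), and the sign of the weighted sum gives the predicted strand.

```python
from typing import Dict, Tuple

# Hypothetical feature weights for illustration only; these are NOT the
# parameters or values used by CRISPRDirection.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "repeat_motif": 3.0,
    "repeat_degeneracy": 2.0,
    "flank_at_content": 1.0,
}


def predict_strand(votes: Dict[str, int],
                   weights: Dict[str, float]) -> Tuple[str, float]:
    """Combine per-feature strand votes by weighted voting.

    votes maps a feature name to +1 (forward), -1 (reverse) or 0 (abstain).
    Features without an entry in weights contribute nothing.
    Returns the predicted strand label and the weighted score.
    """
    score = 0.0
    for name, vote in votes.items():
        if vote not in (-1, 0, 1):
            raise ValueError(f"vote for {name!r} must be -1, 0 or 1")
        score += weights.get(name, 0.0) * vote

    if score > 0:
        return "forward", score
    if score < 0:
        return "reverse", score
    # A tied or empty vote gives no basis for a call.
    return "undetermined", score


if __name__ == "__main__":
    example_votes = {
        "repeat_motif": 1,
        "repeat_degeneracy": 0,
        "flank_at_content": -1,
    }
    print(predict_strand(example_votes, EXAMPLE_WEIGHTS))
```

In this sketch a strongly weighted feature can outvote several weak ones, and a tied score is reported as undetermined rather than forced onto one strand. How the real tool weights its parameters and handles ties is not stated in the abstract.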
