CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved direct repeats (DRs) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacers), usually of viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader, which is assumed to act as a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs, including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) retrieve the flanking sequences to determine the leader; (iv) BLAST spacers against the GenBank database; and (v) check whether the DR is found elsewhere in sequenced prokaryotic genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
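
The repeat/spacer layout described above can be illustrated with a minimal sketch. This is not CRISPRFinder's algorithm, which the abstract does not specify. It assumes exact repeat matches and a single fixed repeat length. The function names (`find_repeat_arrays`, `extract_spacers`), the parameter defaults, and the toy sequence are all hypothetical and chosen only for illustration.

```python
def find_repeat_arrays(seq, repeat_len=24, min_spacer=20, max_spacer=50, min_repeats=3):
    """Find exact repeats of a fixed length that recur at regular spacing.

    Illustrative only: real CRISPR detection must tolerate mismatches
    and variable repeat lengths, which this sketch does not.
    Returns a list of (repeat_sequence, [start positions]) tuples.
    """
    # Index every window of length repeat_len by its start positions.
    positions = {}
    for i in range(len(seq) - repeat_len + 1):
        positions.setdefault(seq[i:i + repeat_len], []).append(i)

    arrays = []
    for kmer, starts in positions.items():
        run = [starts[0]]
        for s in starts[1:]:
            gap = s - (run[-1] + repeat_len)  # length of the candidate spacer
            if min_spacer <= gap <= max_spacer:
                run.append(s)
            else:
                if len(run) >= min_repeats:
                    arrays.append((kmer, run))
                run = [s]
        if len(run) >= min_repeats:
            arrays.append((kmer, run))
    return arrays


def extract_spacers(seq, starts, repeat_len):
    """Return the sequences lying between consecutive repeat copies."""
    return [seq[a + repeat_len:b] for a, b in zip(starts, starts[1:])]


# Toy input (made up): three copies of one repeat separated by two spacers.
DR = "GTTTTAGAGCTATGCTGTTTTGAA"
SPACER_1 = "ACGTACGGATCCATGCAATCGGCTAAGCTA"
SPACER_2 = "CTAGGCATCGATTACGGCATTAGCCGATAT"
toy_seq = "AAAA" + DR + SPACER_1 + DR + SPACER_2 + DR + "CCCC"

for repeat, starts in find_repeat_arrays(toy_seq, repeat_len=len(DR)):
    print(repeat, starts, extract_spacers(toy_seq, starts, len(DR)))
```

The extracted spacers are the sequences that a service like the one described would then compare against a sequence database. That comparison step is not shown here.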
