Unification of Cas protein families and a simple scenario for the origin and evolution of CRISPR-Cas systems

Background: The CRISPR-Cas adaptive immunity systems that are present in most Archaea and many Bacteria function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA. Comparative analysis of the Cas protein sequences and structures led to the classification of the CRISPR-Cas systems into three Types (I, II and III).

Results: A detailed comparison of the available sequences and structures of Cas proteins revealed several previously unnoticed homologous relationships. The Repeat-Associated Mysterious Proteins (RAMPs), which contain a distinct form of the RNA Recognition Motif (RRM) domain and are major components of the CRISPR-Cas systems, were classified into three large groups: Cas5, Cas6 and Cas7. Each of these groups includes many previously uncharacterized proteins now shown to adopt the RAMP structure. Evidence is presented that the large subunits contained in most of the CRISPR-Cas systems could be homologous to Cas10 proteins, which contain a polymerase-like Palm domain and are predicted to be enzymatically active in Type III CRISPR-Cas systems but inactivated in Type I systems. These findings, the fact that the CRISPR polymerases, RAMPs and Cas2 all contain core RRM domains, and the distinct gene arrangements in the three types of CRISPR-Cas systems together suggest a simple scenario for the origin and evolution of the CRISPR-Cas machinery. Under this scenario, the CRISPR-Cas system originated in thermophilic Archaea and subsequently spread horizontally among prokaryotes.

Conclusions: Because of the extreme diversity of CRISPR-Cas systems, in-depth sequence and structure comparison continues to reveal unexpected homologous relationships among Cas proteins. Unification of Cas protein families previously considered unrelated provides for improvement in the classification of CRISPR-Cas systems and a reconstruction of their evolution.

Open peer review: This article was reviewed by Malcolm White (nominated by Purificacion Lopez-Garcia), Frank Eisenhaber and Igor Zhulin. For the full reviews, see the Reviewers' Comments section.
