On the Origin of Reverse Transcriptase-Using CRISPR-Cas Systems and Their Hyperdiverse, Enigmatic Spacer Repertoires

ABSTRACT Cas1 integrase is the key enzyme of the clustered regularly interspaced short palindromic repeat (CRISPR)-Cas adaptation module that mediates acquisition of spacers derived from foreign DNA by CRISPR arrays. In diverse bacteria, the cas1 gene is fused (or adjacent) to a gene encoding a reverse transcriptase (RT) related to group II intron RTs. An RT-Cas1 fusion protein has been recently shown to enable acquisition of CRISPR spacers from RNA. Phylogenetic analysis of the CRISPR-associated RTs demonstrates monophyly of the RT-Cas1 fusion, and coevolution of the RT and Cas1 domains. Nearly all such RTs are present within type III CRISPR-Cas loci, but their phylogeny does not parallel the CRISPR-Cas type classification, indicating that RT-Cas1 is an autonomous functional module that is disseminated by horizontal gene transfer and can function with diverse type III systems. To compare the sequence pools sampled by RT-Cas1-associated and RT-lacking CRISPR-Cas systems, we obtained samples of a commercially grown cyanobacterium, Arthrospira platensis. Sequencing of the CRISPR arrays uncovered a highly diverse population of spacers. Spacer diversity was particularly striking for the RT-Cas1-containing type III-B system, where no saturation was evident even with millions of sequences analyzed. In contrast, analysis of the RT-lacking type III-D system yielded a highly diverse pool but reached a point where fewer novel spacers were recovered as sequencing depth was increased. Matches could be identified for a small fraction of the non-RT-Cas1-associated spacers, and for only a single RT-Cas1-associated spacer. Thus, the principal source(s) of the spacers, particularly the hypervariable spacer repertoire of the RT-associated arrays, remains unknown.

IMPORTANCE While the majority of CRISPR-Cas immune systems adapt to foreign genetic elements by capturing segments of invasive DNA, some systems carry reverse transcriptases (RTs) that enable adaptation to RNA molecules. From analysis of available bacterial sequence data, we find evidence that RT-based RNA adaptation machinery has been able to join with CRISPR-Cas immune systems in many, diverse bacterial species. To investigate whether the abilities to adapt to DNA and RNA molecules are utilized for defense against distinct classes of invaders in nature, we sequenced CRISPR arrays from samples of commercial-scale open-air cultures of Arthrospira platensis, a cyanobacterium that contains both RT-lacking and RT-containing CRISPR-Cas systems. We uncovered a diverse pool of naturally occurring immune memories, with the RT-lacking locus acquiring a number of segments matching known viral or bacterial genes, while the RT-containing locus has acquired spacers from a distinct sequence pool for which the source remains enigmatic.
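
The contrast drawn above between a spacer pool that shows no saturation and one that yields fewer novel spacers at greater sequencing depth can be illustrated with a rarefaction curve: subsample the spacer reads to increasing depths and count the distinct spacers recovered at each depth. The sketch below is only an illustration and is not the analysis pipeline used in this work. The function names `rarefaction_curve` and `chao1` and the toy spacer labels are hypothetical, and the bias-corrected Chao1 richness estimator is included as an assumption about how the total pool size might be estimated.

```python
import random
from collections import Counter


def rarefaction_curve(spacers, depths, trials=20, seed=0):
    """Mean number of distinct spacers observed at each subsampling depth.

    spacers: list with one entry (e.g. a sequence string) per sequenced spacer.
    depths:  subsample sizes; values above len(spacers) are capped.
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        depth = min(depth, len(spacers))
        total = 0
        for _ in range(trials):
            # Sample reads without replacement and count unique spacers.
            total += len(set(rng.sample(spacers, depth)))
        curve.append(total / trials)
    return curve


def chao1(spacers):
    """Bias-corrected Chao1 estimate of the total number of distinct spacers.

    Uses the counts of spacers seen exactly once (f1) and exactly twice (f2).
    """
    counts = Counter(spacers)
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def demo():
    """Toy comparison (hypothetical data, not from the study)."""
    rng = random.Random(1)
    # Saturating pool: many reads drawn from only 50 distinct spacers.
    saturating = [f"s{rng.randrange(50)}" for _ in range(2000)]
    # Non-saturating pool: almost every read is a new spacer.
    open_ended = [f"u{rng.randrange(10**6)}" for _ in range(2000)]
    depths = [100, 500, 1000, 2000]
    return {
        "saturating": rarefaction_curve(saturating, depths),
        "open_ended": rarefaction_curve(open_ended, depths),
    }
```

Calling `demo()` returns one curve per toy pool. A curve that flattens as depth grows corresponds to the saturating behavior described for the RT-lacking locus, whereas a curve that keeps rising roughly in proportion to depth corresponds to the unsaturated behavior described for the RT-Cas1-associated locus.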
