Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair

Non-homologous end-joining (NHEJ) plays an important role in double-strand break (DSB) repair of DNA. Recent studies have shown that the error patterns of NHEJ are strongly biased by sequence context, but these studies were based on relatively few templates. To investigate this more thoroughly, we systematically profiled ∼1.16 million independent mutational events resulting from CRISPR/Cas9-mediated cleavage and NHEJ-mediated DSB repair of 6,872 synthetic target sequences, introduced into a human cell line via lentiviral infection. We find that: 1) insertions are dominated by 1 bp events templated by the sequence immediately upstream of the cleavage site, 2) deletions are predominantly associated with microhomology, and 3) targets exhibit variable but reproducible diversity with respect to the number and relative frequency of the mutational outcomes to which they give rise. From these data, we trained a model that uses local sequence context to predict the distribution of mutational outcomes. Exploiting the bias of NHEJ outcomes towards microhomology-mediated events, we demonstrate the programming of deletion patterns by introducing microhomology at specific locations in the vicinity of the DSB site. We anticipate that our results will inform investigations of DSB repair mechanisms as well as the design of CRISPR/Cas9 experiments for diverse applications including genome-wide screens, gene therapy, lineage tracing, and molecular recording.
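
To make the first two outcome classes concrete, the following is a minimal Python sketch of what a 1 bp templated insertion and a microhomology-associated deletion look like on a toy sequence. It is an illustration only, not the profiling pipeline or the predictive model described above. The function names, the toy sequence, and the coordinate convention (0-based indices, with the cut site given as the position between two bases) are all assumptions made for this example.

```python
def templated_1bp_insertion(seq: str, cut: int) -> str:
    """Duplicate the base immediately upstream of the cut (assumed convention:
    `cut` is the index of the first base downstream of the break)."""
    if not 0 < cut <= len(seq):
        raise ValueError("cut must have at least one base upstream")
    return seq[:cut] + seq[cut - 1] + seq[cut:]


def apply_deletion(seq: str, start: int, end: int) -> str:
    """Return the repair product after deleting seq[start:end]."""
    return seq[:start] + seq[end:]


def microhomology_length(seq: str, start: int, end: int) -> int:
    """Count how many bases the deletion seq[start:end] can slide left or right
    while yielding the same product. A nonzero count means the junction is
    flanked by a repeated stretch, i.e. microhomology (hypothetical helper)."""
    left = 0
    while start - left - 1 >= 0 and seq[start - left - 1] == seq[end - left - 1]:
        left += 1
    right = 0
    while end + right < len(seq) and seq[start + right] == seq[end + right]:
        right += 1
    return left + right


# Toy target with the repeat "ACG" on both sides of an assumed cut at index 6.
target = "GGACGTTTACGCC"
cut_site = 6

insertion_product = templated_1bp_insertion(target, cut_site)
deletion_product = apply_deletion(target, 2, 8)
mh = microhomology_length(target, 2, 8)
```

In this toy case the deletion removes one copy of "ACG" together with the intervening bases, so the same product arises from four equivalent deletion placements, all of which span the assumed cut site. Placing such a repeat on either side of a cut is the kind of design the abstract refers to when it describes programming deletion patterns.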
