Development of a set of C•G-to-G•C transversion base editors from CRISPRi screens, target-library analysis, and machine learning

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have been difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPRi screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and used these data to train machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90). These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE-sgRNA pairs enables high-purity transversion base editing at >4-fold more target sites than can be achieved using any single CGBE variant.
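
The idea of computationally choosing a CGBE-sgRNA pair can be illustrated with a minimal sketch. It assumes that model predictions are already available as purity and yield values per pair; the `Prediction` record, the `pick_best_pair` helper, the editor names, and the 0.90 purity cutoff are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Iterable, NamedTuple, Optional


class Prediction(NamedTuple):
    """Hypothetical record of model output for one CGBE-sgRNA pair."""
    editor: str          # placeholder editor name, not a real variant
    sgrna: str           # placeholder sgRNA identifier
    purity: float        # predicted fraction of edited reads with the desired edit
    yield_: float        # predicted fraction of all reads with the desired edit


def pick_best_pair(
    predictions: Iterable[Prediction],
    min_purity: float = 0.90,  # assumed cutoff, chosen for illustration
) -> Optional[Prediction]:
    """Return the pair with the highest predicted yield among those
    meeting the purity cutoff, or None if no pair qualifies."""
    eligible = [p for p in predictions if p.purity >= min_purity]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p.yield_)


if __name__ == "__main__":
    candidates = [
        Prediction("CGBE-A", "sg1", purity=0.95, yield_=0.10),
        Prediction("CGBE-B", "sg1", purity=0.92, yield_=0.30),
        Prediction("CGBE-C", "sg2", purity=0.50, yield_=0.60),
    ]
    print(pick_best_pair(candidates))
```

Filtering on purity before ranking by yield reflects the abstract's emphasis on high-purity editing; a different ranking rule could be swapped in without changing the structure.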
