SpCas9 activity prediction by DeepSpCas9, a deep learning–based model with high generalization performance

A deep learning–based model accurately predicts the activity of SpCas9, a widely used genome editing tool. We evaluated SpCas9 activities at 12,832 target sequences using a high-throughput approach based on a human cell library containing pairs of single-guide RNA–encoding and target sequences. Deep learning–based training on this large dataset of SpCas9-induced indel frequencies led to the development of an SpCas9 activity–predicting model named DeepSpCas9. When tested against independently generated datasets (our own and those published by other groups), DeepSpCas9 showed high generalization performance. DeepSpCas9 is available at http://deepcrispr.info/DeepSpCas9.
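One common way to quantify how well an activity predictor generalizes is to rank-correlate its scores with indel frequencies measured in an independent dataset. The sketch below illustrates that idea under stated assumptions: the choice of Spearman correlation as the metric is an assumption made here, not something the abstract specifies, and the function names and numeric values are hypothetical and made up for illustration. They are not DeepSpCas9 outputs or data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical example: model scores vs. measured indel frequencies (%)
# for five target sequences from a held-out dataset. Values are invented.
predicted_scores = [0.82, 0.15, 0.47, 0.66, 0.30]
measured_indel_pct = [71.0, 9.5, 38.2, 44.1, 40.0]

print(round(spearman(predicted_scores, measured_indel_pct), 3))
```

A rank-based metric is a reasonable choice for this kind of comparison because predicted scores and measured indel frequencies need not share a scale; only their ordering across target sequences matters.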
