Discovery of novel DNA cytosine deaminase activities enables a nondestructive single-enzyme methylation sequencing method for base resolution high-coverage methylome mapping of cell-free and ultra-low input DNA

Cytosine deaminases have important uses in the detection of epigenetic modifications and in genome editing. However, the range of applications of deaminases is limited by the small number of well-characterized enzymes. To expand the toolkit of deaminases, we developed an in vitro approach that bypasses a major hurdle: their severe toxicity in expression hosts. We systematically assayed the activity of 175 putative cytosine deaminases on an unprecedented variety of substrates with epigenetically relevant base modifications. We found enzymes with high activity on double- and single-stranded DNA in various sequence contexts, including novel CpG-specific deaminases, as well as enzymes without sequence preference. We also report, for the first time, enzymes that do not deaminate modified cytosines. The remarkable diversity of cytosine deaminases opens new avenues for biotechnological and medical applications. Using a newly discovered non-specific, modification-sensitive double-stranded DNA deaminase, we developed a nondestructive single-enzyme 5-methylcytosine sequencing (SEM-seq) method. SEM-seq enables accurate, high-coverage, base-resolution methylome mapping of scarce biological material, including clinically relevant cell-free DNA (cfDNA) and single-cell-equivalent 10 pg input DNA. Using SEM-seq, we generated highly reproducible base-resolution 5mC maps, accounting for nearly 80% of CpG islands for a low-input human cfDNA sample, offering valuable information for identifying potential biomarkers for detection of early-stage cancer and other diseases. This streamlined protocol will enable robust, high-throughput, high-coverage epigenome profiling of challenging samples in research and clinical settings.
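To illustrate how a modification-sensitive deaminase can yield a base-resolution readout, the following minimal Python sketch estimates a methylation level at each CpG. It rests on assumptions that the abstract does not spell out: unmodified cytosines are deaminated and read as T after sequencing, protected (modified) cytosines are read as C, and all reads are gap-free, forward-strand, and aligned to the first base of the reference. The function name `cpg_methylation` and the toy sequences are hypothetical, and this is not the authors' analysis pipeline.

```python
from typing import Dict, List


def cpg_methylation(reference: str, reads: List[str]) -> Dict[int, float]:
    """Estimate the methylation fraction at each CpG cytosine.

    Assumption (illustrative only): a C read at a reference CpG cytosine
    means the base was protected from deamination (modified), while a T
    read means it was deaminated (unmodified). Other bases are ignored.
    """
    levels: Dict[int, float] = {}
    for pos in range(len(reference) - 1):
        # Only score cytosines that sit in a CpG context on the reference.
        if reference[pos:pos + 2] != "CG":
            continue
        protected = sum(1 for r in reads if len(r) > pos and r[pos] == "C")
        converted = sum(1 for r in reads if len(r) > pos and r[pos] == "T")
        total = protected + converted
        if total:  # skip CpGs with no informative reads
            levels[pos] = protected / total
    return levels


# Toy data: two CpG sites (0-based positions 1 and 5), four reads.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ACGTTCGA", "ATGTTTGA", "ACGTTTGA"]
levels = cpg_methylation(reference, reads)
print(levels)
```

On this toy input, three of four reads keep C at position 1 and one of four keeps C at position 5, so the sketch reports fractions of 0.75 and 0.25 respectively. A real workflow would also need to handle the reverse strand, alignment gaps, and base-quality filtering.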
