Msuite: A High-Performance and Versatile DNA Methylation Data-Analysis Toolkit

Summary: DNA methylation is a pervasive and important epigenetic regulator in mammalian genomes. For DNA methylome profiling, emerging bisulfite-free methods have demonstrated clear advantages over conventional bisulfite-treatment-based approaches, but current analysis software cannot make full use of these advantages. In this work, we present Msuite, an easy-to-use, all-in-one data-analysis toolkit. Msuite implements a unique 4-letter analysis mode specifically optimized for the emerging protocols; it also integrates quality control, methylation calling, and data visualization. Msuite demonstrates substantial performance improvements over current state-of-the-art tools and offers rich functionality, and thus has the potential to serve as an optimal toolkit for DNA methylome studies. Source code and testing datasets for Msuite are freely available at https://github.com/hellosunking/Msuite/.
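The summary does not spell out how the 4-letter mode works, so the following is only an illustrative sketch under a stated assumption: that conventional bisulfite aligners collapse every C to T in silico (a 3-letter alphabet), whereas a 4-letter scheme converts only cytosines in a CpG context and leaves all other cytosines informative for alignment. The function names are hypothetical and are not part of Msuite.

```python
import re

def convert_three_letter(seq: str) -> str:
    """Conventional in-silico conversion: every C becomes T,
    leaving an effective 3-letter alphabet (A, G, T)."""
    return seq.upper().replace("C", "T")

def convert_four_letter(seq: str) -> str:
    """Assumed 4-letter scheme (illustrative only): convert a C to T
    only when it is immediately followed by G (CpG context), so that
    non-CpG cytosines are kept as C."""
    return re.sub(r"C(?=G)", "T", seq.upper())

if __name__ == "__main__":
    example = "ACGTCCGA"
    print(convert_three_letter(example))
    print(convert_four_letter(example))
```

Under this assumption, the 4-letter conversion retains more sequence complexity than the 3-letter one, which is one plausible reason such a mode could help with alignment speed and accuracy. A C at the very end of a read has unknown context and is left unconverted in this sketch.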
