Methy-Pipe: An Integrated Bioinformatics Pipeline for Whole Genome Bisulfite Sequencing Data Analysis

DNA methylation, one of the most important epigenetic modifications, plays a crucial role in various biological processes. The level of DNA methylation can be measured at single-base resolution using whole-genome bisulfite sequencing. However, there has so far been a paucity of publicly available software for integrated methylation data analysis. In this study, we implemented Methy-Pipe, which not only fulfills the core data analysis requirements (e.g., sequence alignment and differential methylation analysis) but also provides useful tools for methylation data annotation and visualization. Specifically, it uses the Burrows-Wheeler Transform (BWT) algorithm to align bisulfite sequencing reads directly to a reference genome, and it implements a novel sliding-window-based approach with statistical methods for the identification of differentially methylated regions (DMRs). Its ability to process data in parallel allows it to outperform a number of other bisulfite alignment software packages. To demonstrate its utility and performance, we applied it to both real and simulated bisulfite sequencing datasets. The results indicate that Methy-Pipe can accurately estimate methylation densities and identify DMRs, and that it provides a variety of utility programs for downstream methylation data analysis. In summary, Methy-Pipe is a useful pipeline that can process whole-genome bisulfite sequencing data in an efficient, accurate, and user-friendly manner. The software and a test dataset are available at http://sunlab.lihs.cuhk.edu.hk/methy-pipe/.
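To make the sliding-window idea concrete, the following is a minimal Python sketch and not Methy-Pipe's actual implementation. It assumes that methylation density in a window is the number of methylated read calls divided by all read calls at the CpG sites in that window, and it flags windows where two samples differ by at least a fixed amount. The input layout, the function names (`methylation_density`, `candidate_windows`), the window size, the step, and the `min_diff` threshold are all hypothetical. The abstract does not specify the statistical methods used, so no significance test is included here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input layout: CpG position -> (methylated reads, unmethylated reads)
Counts = Dict[int, Tuple[int, int]]


def methylation_density(counts: Counts, start: int, end: int) -> Optional[float]:
    """Return M / (M + U) over CpG sites in [start, end), or None if uncovered."""
    meth = 0
    unmeth = 0
    for pos, (m, u) in counts.items():
        if start <= pos < end:
            meth += m
            unmeth += u
    total = meth + unmeth
    return meth / total if total else None


def candidate_windows(
    sample_a: Counts,
    sample_b: Counts,
    region_end: int,
    window: int = 500,      # assumed window size in bp
    step: int = 250,        # assumed step between window starts
    min_diff: float = 0.3,  # assumed minimum density difference
) -> List[Tuple[int, int, float, float]]:
    """Slide a window along [0, region_end) and keep windows whose
    methylation densities differ by at least min_diff between samples."""
    hits = []
    for start in range(0, region_end, step):
        end = start + window
        density_a = methylation_density(sample_a, start, end)
        density_b = methylation_density(sample_b, start, end)
        if density_a is None or density_b is None:
            continue  # skip windows with no coverage in either sample
        if abs(density_a - density_b) >= min_diff:
            hits.append((start, end, density_a, density_b))
    return hits


if __name__ == "__main__":
    # Toy data: the two samples differ near positions 100-200 only.
    a = {100: (9, 1), 200: (8, 2), 900: (5, 5)}
    b = {100: (1, 9), 200: (2, 8), 900: (5, 5)}
    for hit in candidate_windows(a, b, region_end=1000):
        print(hit)
```

A real pipeline would replace the fixed threshold with a statistical test on the per-window counts and would merge adjacent flagged windows into regions; both steps are left out of this sketch.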
