P-Hint-Hunt: a deep parallelized whole genome DNA methylation detection tool

Background: A growing number of studies use whole genome DNA methylation detection, one of the most important parts of epigenetics research, to find significant relationships between DNA methylation and several typical diseases, such as cancers and diabetes. In many of these studies, mapping bisulfite-treated sequences to the whole genome has been the main method for studying DNA cytosine methylation. However, most current tools suffer from inaccuracy and long running times.

Results: We designed a new DNA methylation prediction tool (“Hint-Hunt”) to solve these problems. Using an optimal complex alignment computation and Smith-Waterman matrix dynamic programming, Hint-Hunt can analyze and predict DNA methylation status. On large-scale datasets, however, Hint-Hunt is still slow and has low time and space efficiency. To address the cost of Smith-Waterman dynamic programming and the low time and space efficiency, we further designed a deep parallelized whole genome DNA methylation detection tool (“P-Hint-Hunt”) on the Tianhe-2 (TH-2) supercomputer.

Conclusions: To the best of our knowledge, P-Hint-Hunt is the first parallel DNA methylation detection tool with a high speed-up for processing large-scale datasets, and it can run on both CPUs and Intel Xeon Phi coprocessors. Moreover, we deployed and evaluated Hint-Hunt and P-Hint-Hunt on the TH-2 supercomputer at different scales. The experimental results show that our tools eliminate the deviation caused by bisulfite treatment in the mapping procedure and that the multi-level parallel program yields a 48-fold speed-up with 64 threads. P-Hint-Hunt achieves deep acceleration on the CPU and Intel Xeon Phi heterogeneous platform, taking full advantage of multi-core (CPU) and many-core (Phi) processors.
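To make the alignment idea concrete, the following is a minimal sketch of Smith-Waterman local alignment with a bisulfite-aware scoring rule. It is not the Hint-Hunt implementation: the function names (`bs_score`, `smith_waterman_bs`), the scoring values, the linear gap penalty, and the forward-strand-only assumption are all illustrative choices. The sketch relies only on the general fact that bisulfite treatment converts unmethylated cytosines so they are read as T, while methylated cytosines remain C, so a read T over a reference C is scored as a match instead of a mismatch.

```python
def bs_score(ref_base, read_base, match=2, mismatch=-1):
    """Score one aligned pair; scoring values are illustrative assumptions."""
    if ref_base == read_base:
        return match
    # A read T over a reference C is consistent with an unmethylated,
    # bisulfite-converted cytosine, so it is not penalized.
    if ref_base == "C" and read_base == "T":
        return match
    return mismatch


def smith_waterman_bs(ref, read, gap=-2):
    """Local alignment of read against ref (forward strand only).

    Returns the best score and a dict mapping each aligned reference
    cytosine position (0-based) to True (methylated) or False (unmethylated).
    """
    rows, cols = len(ref) + 1, len(read) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + bs_score(ref[i - 1], read[j - 1])
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            H[i][j] = max(0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell and call methylation at reference Cs.
    calls = {}
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + bs_score(ref[i - 1], read[j - 1]):
            if ref[i - 1] == "C" and read[j - 1] in "CT":
                calls[i - 1] = read[j - 1] == "C"
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, calls


if __name__ == "__main__":
    score, calls = smith_waterman_bs("ACGTCGA", "ATGTCGA")
    print(score, calls)
```

Each matrix cell depends only on its upper, left, and upper-left neighbors, so cells on the same anti-diagonal are independent of one another. That is one common route to parallelizing this kind of dynamic programming, although the abstract does not say which strategy P-Hint-Hunt uses.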
