A Wavelet Approach to Detect Enriched Regions and Explore Epigenomic Landscapes

Epigenetic landscapes represent how cells regulate gene activity. To understand how epigenetic marks affect gene regulation, it is important to detect the regions of the genome they occupy. Unlike transcription factors, whose binding is limited to narrow regions, histone modification marks are enriched over broader areas. The stochastic characteristics unique to each mark make this enrichment hard to detect. Classically, enrichment has been detected with a predefined window. However, window-based approaches rely heavily on predetermined parameters, and they cannot handle the enrichment of multiple marks. We propose a novel algorithm, called SeqW, to detect enrichment of multiple histone modification marks. SeqW applies a zooming approach to detect broadly enriched domains, which helps domain detection by increasing the signal-to-noise ratio. The borders of the domains are detected by studying the characteristics of the signals in the wavelet domain. We show that SeqW outperformed previous predictors in detecting broad peaks. We also applied SeqW to study spatial combinations of histone modification patterns.
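
The idea of locating domain borders in the wavelet domain can be illustrated with a minimal sketch. This is not the SeqW implementation: the choice of a first-derivative-of-Gaussian wavelet, the function names `gaussian_derivative_kernel` and `detect_domain_borders`, the single-domain assumption, and the synthetic Poisson coverage are all assumptions made for illustration. Convolving a coverage track with this wavelet gives the slope of the smoothed signal, so a rising border appears as a maximum of the response and a falling border as a minimum.

```python
import numpy as np

def gaussian_derivative_kernel(scale):
    """First derivative of a unit-sum Gaussian, sampled on [-4*scale, 4*scale]."""
    half = int(np.ceil(4 * scale))
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * scale**2))
    g /= g.sum()
    return -x / scale**2 * g

def detect_domain_borders(signal, scale):
    """Return (start, end) of a single enriched domain (hypothetical helper).

    The response is the derivative of the Gaussian-smoothed signal:
    its maximum marks the rising border, its minimum the falling border.
    """
    kernel = gaussian_derivative_kernel(scale)
    half = len(kernel) // 2
    # Reflect-pad so the track ends do not look like artificial borders.
    padded = np.pad(np.asarray(signal, dtype=float), half, mode="reflect")
    response = np.convolve(padded, kernel, mode="valid")
    return int(np.argmax(response)), int(np.argmin(response))

# Synthetic coverage (assumed for illustration): background rate 2,
# one broad enriched domain with rate 8 covering positions 300..699.
rng = np.random.default_rng(0)
coverage = rng.poisson(2.0, 1000).astype(float)
coverage[300:700] = rng.poisson(8.0, 400)

for scale in (5, 20):
    print(scale, detect_domain_borders(coverage, scale))
```

In this sketch a larger `scale` averages over more positions, which raises the signal-to-noise ratio of the response at the cost of border resolution; comparing responses across scales is one plausible reading of a zooming strategy, not a description of the published method.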
