An introduction to computational tools for differential binding analysis with ChIP-seq data

Background: Gene transcription in eukaryotic cells is collectively controlled by a large panel of chromatin-associated proteins, and ChIP-seq is now widely used to locate their binding sites across the whole genome. Inferring the differential binding sites of these proteins between biological conditions by comparing the corresponding ChIP-seq samples is of general interest, yet it remains a computationally challenging task.

Results: Here, we briefly review the computational tools developed in recent years for differential binding analysis with ChIP-seq data. The methods are classified by their statistical modeling strategy and their scope of application. Finally, a decision tree is presented for choosing suitable tools based on the characteristics of a specific dataset.

Conclusions: Computational tools for differential binding analysis with ChIP-seq data vary significantly in their applicability and performance. This review can serve as a practical guide for readers selecting appropriate tools for their own datasets.
