Modeling epigenetic modifications under multiple treatment conditions

ChIP-chip is a powerful tool for epigenetic research. However, current statistical methods were developed primarily for detecting transcription factor binding sites, and no satisfactory method exists for incorporating covariates such as time, hormone levels, and genotypes. In this study, we develop a varying coefficient model for epigenetic modifications such as histone acetylation and DNA methylation. Taking into account the special features of ChIP-chip data, we derive a plug-in-type method for bandwidth selection in the local linear fitting of the varying coefficient model. Our results show that analyses using the proposed model can effectively detect diverse characteristics of epigenetic modifications both over genomic regions and across treatment conditions.
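To make the idea of local linear fitting in a varying coefficient model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a generic model y_i = sum_j a_j(u_i) x_ij + error, where u_i is read as a genomic position and x_ij as treatment covariates. The function names, the Epanechnikov kernel, the simulated data, and the hand-picked bandwidth `h` are all illustrative assumptions; the plug-in bandwidth selector described in the abstract is not reproduced here.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel; zero outside [-1, 1]. (Kernel choice is an assumption.)"""
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def local_linear_vcm(u, X, y, u0, h):
    """Hypothetical helper: local linear estimate of a(u0) in y = X @ a(u) + noise.

    Near u0 each coefficient is approximated by a_j + b_j * (u - u0), and the
    (a, b) pair is found by kernel-weighted least squares with bandwidth h.
    """
    d = u - u0
    w = epanechnikov(d / h) / h              # kernel weights around u0
    Z = np.hstack([X, X * d[:, None]])       # columns for a(u0), then a'(u0)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return coef[: X.shape[1]]                # keep only the a(u0) part

# Simulated illustration (not real ChIP-chip data).
rng = np.random.default_rng(0)
n = 400
u = np.linspace(0.0, 1.0, n)                 # stand-in for probe positions
treat = (np.arange(n) % 2).astype(float)     # stand-in for a treatment indicator
X = np.column_stack([np.ones(n), treat])
a0 = np.sin(2 * np.pi * u)                   # baseline signal along the region
a1 = 2.0 * u                                 # treatment effect varying with position
y = a0 + a1 * treat + rng.normal(scale=0.2, size=n)

grid = np.linspace(0.1, 0.9, 9)
est = np.array([local_linear_vcm(u, X, y, g, h=0.1) for g in grid])
print(np.round(est, 2))                      # columns: estimated a0(u), a1(u)
```

The bandwidth `h` controls the trade-off between smoothness and resolution along the region, which is why a data-driven selector such as the plug-in method mentioned above matters in practice.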
