Bayesian functional data analysis over dependent regions and its application for identification of differentially methylated regions

We consider a Bayesian functional data analysis for observations measured as extremely long sequences. When such a sequence is split into a number of small windows of manageable length, the windows may not be independent, especially when they are adjacent. We propose using Bayesian smoothing splines to estimate individual functional patterns within each window, and transition models for the window-level parameters to capture the dependence between windows. The functional difference between groups of individuals in each window can be evaluated with a Bayes factor computed from the Markov chain Monte Carlo samples. In this paper, we examine the proposed method through simulation studies and apply it to identify differentially methylated genetic regions in TCGA lung adenocarcinoma data.
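The dependence between neighboring windows that motivates the transition models can be illustrated with a small sketch. This is not the paper's method: it only splits a toy sequence into fixed-length windows and measures how strongly adjacent window means are correlated. The AR(1) generating process, the parameter values, and the helper names `split_into_windows` and `lag1_correlation` are all assumptions made for illustration.

```python
import numpy as np


def split_into_windows(y, window_len):
    """Split a 1-D sequence into consecutive, non-overlapping windows.

    A trailing remainder shorter than window_len is dropped.
    """
    y = np.asarray(y, dtype=float)
    n_windows = len(y) // window_len
    return y[: n_windows * window_len].reshape(n_windows, window_len)


def lag1_correlation(x):
    """Sample correlation between x[k] and x[k + 1]."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])


# Hypothetical toy data (an assumption, not the TCGA data):
# each window has a latent level, and levels follow an AR(1) transition.
rng = np.random.default_rng(0)
n_windows, window_len, rho = 200, 50, 0.8
level = np.zeros(n_windows)
for k in range(1, n_windows):
    level[k] = rho * level[k - 1] + rng.normal(scale=1.0)

# Observations are the window level plus independent measurement noise.
y = np.repeat(level, window_len) + rng.normal(scale=0.5, size=n_windows * window_len)

windows = split_into_windows(y, window_len)
window_means = windows.mean(axis=1)

print(windows.shape)
print(lag1_correlation(window_means))
```

A lag-1 correlation well above zero indicates that treating the windows as independent would discard information, which is the situation a transition model between window-level parameters is meant to address.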
