Bayesian Modeling of Differential Gene Expression

Summary: We present a Bayesian hierarchical model for detecting differentially expressed genes that includes simultaneous estimation of array effects, and show how to use the output for choosing lists of genes for further investigation. We give empirical evidence that expression‐level dependent array effects are needed, and explore different nonlinear functions as part of our model‐based approach to normalization. The model includes gene‐specific variances but imposes some necessary shrinkage through a hierarchical structure. Model criticism via posterior predictive checks is discussed. Modeling the array effects (normalization) simultaneously with differential expression gives fewer false positive results. To choose a list of genes, we propose to combine various criteria (for instance, fold change and overall expression) into a single indicator variable for each gene. The posterior distribution of these variables is used to pick the list of genes, thereby taking into account uncertainty in parameter estimates. In an application to mouse knockout data, Gene Ontology annotations over‐ and underrepresented among the genes on the chosen list are consistent with biological expectations.
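The gene-selection step described in the summary can be illustrated with a small sketch. It is not the authors' implementation. It assumes posterior draws of each gene's log fold change and overall expression are already available (here replaced by synthetic Gaussian draws), and the names `posterior_indicator_probs` and `select_genes`, the cutoffs `fold_cut` and `expr_cut`, and the probability threshold `prob_cut` are all hypothetical choices made for illustration.

```python
import random


def posterior_indicator_probs(delta_draws, alpha_draws, fold_cut, expr_cut):
    """Estimate, per gene, the posterior probability that the indicator is 1.

    The indicator for one draw is 1 when both criteria hold:
    |log fold change| > fold_cut and overall expression > expr_cut.
    Averaging the indicator over draws gives its posterior mean.
    """
    probs = []
    for d_gene, a_gene in zip(delta_draws, alpha_draws):
        hits = sum(
            1
            for d, a in zip(d_gene, a_gene)
            if abs(d) > fold_cut and a > expr_cut
        )
        probs.append(hits / len(d_gene))
    return probs


def select_genes(probs, prob_cut):
    """Return indices of genes whose indicator probability reaches prob_cut."""
    return [g for g, p in enumerate(probs) if p >= prob_cut]


# Synthetic stand-in for MCMC output (assumption: 4 genes, 2000 draws each).
rng = random.Random(0)
n_draws = 2000
assumed_delta = [0.0, 0.2, 1.5, -2.0]  # hypothetical log fold changes
assumed_alpha = [8.0, 3.0, 9.0, 7.5]   # hypothetical overall expression levels
delta_draws = [[rng.gauss(m, 0.3) for _ in range(n_draws)] for m in assumed_delta]
alpha_draws = [[rng.gauss(m, 0.3) for _ in range(n_draws)] for m in assumed_alpha]

probs = posterior_indicator_probs(delta_draws, alpha_draws, fold_cut=1.0, expr_cut=5.0)
selected = select_genes(probs, prob_cut=0.8)
print(selected)
```

Because the indicator is evaluated draw by draw, a gene whose point estimate exceeds the cutoffs but whose posterior is wide receives a lower probability than a gene with the same estimate and a tight posterior, which is how uncertainty in the parameter estimates enters the list.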
