Laplace Approximated EM Microarray Analysis: An Empirical Bayes Approach for Comparative Microarray Experiments

A two-groups mixed-effects model for the comparison of (normalized) microarray data from two treatment groups is considered. Most competing parametric methods that have appeared in the literature can be obtained as special cases of the proposed model or by minor modifications of it. Approximate maximum likelihood fitting is accomplished via a fast and scalable algorithm, which we call LEMMA (Laplace approximated EM Microarray Analysis). The posterior odds of treatment $\times$ gene interactions, derived from the model, involve shrinkage estimates of both the interactions and the gene-specific error variances. Genes are classified as being associated with treatment based on the posterior odds and the local false discovery rate (f.d.r.) with a fixed cutoff. Our model-based approach also allows one to declare the non-null status of a gene by controlling the false discovery rate (FDR). A detailed simulation study shows that the approach outperforms well-known competitors. We also apply the proposed methodology to two previously analyzed microarray examples. Extensions of the proposed method to paired treatments and multiple treatments are also discussed.
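The two classification rules mentioned above can be illustrated with a minimal sketch. It is not the LEMMA algorithm: it assumes the local f.d.r. values (posterior null probabilities) have already been estimated by some fitted two-groups model. The function names, the cutoff of 0.2, and the level alpha = 0.05 are hypothetical choices made for illustration. The FDR rule shown here is the common running-average rule, which rejects the largest set of genes whose mean local f.d.r. stays at or below alpha; the paper's own procedure may differ.

```python
import numpy as np


def classify_by_local_fdr(local_fdr, cutoff=0.2):
    """Flag genes whose local f.d.r. is at or below a fixed cutoff.

    The cutoff value is an illustrative assumption, not taken from the paper.
    """
    local_fdr = np.asarray(local_fdr, dtype=float)
    return local_fdr <= cutoff


def classify_by_fdr_control(local_fdr, alpha=0.05):
    """Flag the largest set of genes whose average local f.d.r. is <= alpha.

    Genes are sorted by increasing local f.d.r.; the running mean of the
    sorted values estimates the FDR of rejecting the top-k genes.
    """
    local_fdr = np.asarray(local_fdr, dtype=float)
    order = np.argsort(local_fdr)
    ranks = np.arange(1, local_fdr.size + 1)
    running_mean = np.cumsum(local_fdr[order]) / ranks
    # The running mean of ascending values is nondecreasing, so counting
    # the entries at or below alpha gives the largest admissible k.
    n_reject = int(np.sum(running_mean <= alpha))
    selected = np.zeros(local_fdr.size, dtype=bool)
    selected[order[:n_reject]] = True
    return selected


# Hypothetical local f.d.r. values for five genes.
example_fdr = [0.01, 0.5, 0.03, 0.9, 0.15]
by_cutoff = classify_by_local_fdr(example_fdr, cutoff=0.2)
by_fdr = classify_by_fdr_control(example_fdr, alpha=0.05)
```

The two rules can disagree. In this toy input the third-smallest value, 0.15, passes the fixed cutoff but is not selected under FDR control, because including it would push the average local f.d.r. of the selected set above alpha.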
