On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles

DNA microarrays provide unprecedented large-scale views of gene expression and, as a result, have emerged as a fundamental measurement tool in the study of diverse biological systems. Statistical questions abound, but many traditional data analytic approaches do not apply, in large part because thousands of individual genes are measured with relatively little replication. Empirical Bayes methods provide a natural approach to microarray data analysis because they can significantly reduce the dimensionality of an inference problem while compensating for relatively few replicates by using information across the array. We propose a general empirical Bayes modelling approach which allows for replicate expression profiles in multiple conditions. The hierarchical mixture model accounts for differences among genes in their average expression levels, differential expression for a given gene among cell types, and measurement fluctuations. Two distinct parameterizations are considered: a model based on Gamma-distributed measurements and one based on log-normally distributed measurements. False discovery rate and related operating characteristics of the methodology are assessed in a simulation study. We also show how the posterior odds of differential expression in one version of the model are related to the ratio of the arithmetic mean to the geometric mean of the two sample means. The methodology is used in a study of mammary cancer in the rat, where four distinct patterns of expression are possible.
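
To make the posterior-odds calculation concrete, the following is a minimal sketch of a two-group version of a Gamma-based model. It is an illustration under stated assumptions, not the paper's implementation: measurements for a gene are taken to be Gamma with shape `alpha` and a gene-specific rate, the rate is given a Gamma(`alpha0`, `nu`) prior, and `p_de` is the prior probability of differential expression. The function names, the parameterization, and all numeric values are hypothetical, and the hyperparameters are fixed by hand rather than estimated from the whole array as an empirical Bayes fit would do.

```python
import math


def log_marginal(xs, alpha, alpha0, nu):
    """Log marginal density of measurements that share one latent rate.

    Assumed model (illustrative): x_i | lam ~ Gamma(shape=alpha, rate=lam),
    lam ~ Gamma(shape=alpha0, rate=nu). Integrating out lam gives a closed form.
    """
    n = len(xs)
    total = sum(xs)
    log_prod = sum(math.log(x) for x in xs)
    return (
        alpha0 * math.log(nu)
        + math.lgamma(n * alpha + alpha0)
        - n * math.lgamma(alpha)
        - math.lgamma(alpha0)
        + (alpha - 1.0) * log_prod
        - (n * alpha + alpha0) * math.log(nu + total)
    )


def posterior_prob_de(xs, ys, alpha, alpha0, nu, p_de):
    """Posterior probability that one gene is differentially expressed.

    Differential expression: each group has its own latent rate.
    Equivalent expression: both groups share a single latent rate.
    """
    log_de = log_marginal(xs, alpha, alpha0, nu) + log_marginal(ys, alpha, alpha0, nu)
    log_ee = log_marginal(list(xs) + list(ys), alpha, alpha0, nu)
    log_odds = math.log(p_de) - math.log1p(-p_de) + log_de - log_ee
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical hyperparameters and intensities, chosen only for illustration.
params = dict(alpha=10.0, alpha0=1.0, nu=50.0, p_de=0.05)
p_changed = posterior_prob_de([100, 110, 95], [400, 420, 390], **params)
p_unchanged = posterior_prob_de([100, 110, 95], [105, 98, 102], **params)
print(p_changed, p_unchanged)
```

In this sketch the measurements enter the odds only through the group sums, because the product terms cancel between the two hypotheses. With more than two conditions, the same marginal would be evaluated once for each pattern of equality among the group means, and the pattern probabilities normalized in the same way.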
