A two-sample Bayesian t-test for microarray data

Background: Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically.

Results: A two-sample Bayesian t-test is proposed for use in determining whether a gene is differentially expressed in two different samples. The test method is an extension of earlier work that made use of point estimates for the variance. The method proposed here explicitly calculates in analytic form the marginal distribution for the difference in the mean expression of two samples, obviating the need for point estimates of the variance without recourse to posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and prior variance.

Conclusion: The test is easy to understand and implement, and application to both real and simulated data shows that the method has equal or greater power compared to the previous method and demonstrates consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available and is not limited to cases where estimates of the variance are based on many similar observations.
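
To make the idea of an analytic marginal distribution for the difference in means concrete, the following is a minimal sketch of a textbook conjugate model, not necessarily the exact model of the paper. It assumes a common variance for both samples, flat priors on the two means, and a scaled inverse chi-square prior on the variance with prior degrees of freedom `nu0` and prior variance `sigma0_sq`. Under those assumptions the marginal posterior of the difference in means is a Student t distribution with `nu0 + n1 + n2 - 2` degrees of freedom. The function name `bayes_t_sketch` and all parameter names are hypothetical and chosen for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def bayes_t_sketch(x1, x2, nu0, sigma0_sq):
    """Illustrative conjugate two-sample t statistic (assumed model, see lead-in).

    x1, x2     : sequences of (log) expression values, at least 2 values each
    nu0        : prior degrees of freedom for the common variance (assumption)
    sigma0_sq  : prior variance, e.g. estimated from similar genes (assumption)

    Returns (t_stat, dof, diff, scale), where the marginal posterior of the
    mean difference is Student t with `dof` degrees of freedom, location
    `diff`, and scale `scale`.
    """
    n1, n2 = len(x1), len(x2)

    # Within-sample sums of squares around each sample mean.
    ss = (n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)

    # Posterior degrees of freedom and posterior variance scale: the prior
    # acts like nu0 extra observations with variance sigma0_sq.
    dof = nu0 + n1 + n2 - 2
    sigma_n_sq = (nu0 * sigma0_sq + ss) / dof

    diff = mean(x1) - mean(x2)
    scale = sqrt(sigma_n_sq * (1.0 / n1 + 1.0 / n2))
    return diff / scale, dof, diff, scale


if __name__ == "__main__":
    control = [1.0, 2.0, 3.0]
    treated = [2.0, 3.0, 4.0]
    t_stat, dof, diff, scale = bayes_t_sketch(control, treated, nu0=4, sigma0_sq=1.0)
    print(t_stat, dof)
```

With `nu0 = 0` this sketch reduces to the classical pooled-variance two-sample t statistic, which is one way to see how the prior degrees of freedom control the weight given to the prior variance. A tail probability can then be read from a Student t distribution with `dof` degrees of freedom, for example with `scipy.stats.t.sf(abs(t_stat), dof)` if SciPy is available.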
