A Mixture-Model Approach for Parallel Testing for Unequal Variances

Tests for unequal variances are usually performed to check the assumptions that underlie standard tests for differences between means (the t-test and ANOVA). However, existing tests for unequal variances (Levene's test and Bartlett's test) are notoriously non-robust to departures from normality, especially for small sample sizes. Moreover, although these methods were designed to handle one hypothesis at a time, modern applications (such as microarray and fMRI experiments) often involve parallel testing over a large number of levels (genes or voxels). In these settings a shift in variance may itself be biologically relevant, perhaps even more so than a change in the mean. This paper proposes a parsimonious model for parallel testing of the equal-variance hypothesis. It is designed to work well when the number of tests is large, typically much larger than the sample sizes. The tests are implemented using an empirical Bayes estimation procedure that "borrows information" across levels. The method is shown to be quite robust to deviations from normality and to substantially increase the power to detect differences in variance over the more traditional approaches, even when the normality assumption is valid.
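For orientation, the traditional one-test-per-level baseline that the abstract contrasts with can be sketched as follows. This is not the mixture-model or empirical Bayes procedure proposed in the paper, whose details are not given here. It is only a minimal illustration, assuming two groups, simulated normal data, the median-centered variant of Levene's test from SciPy, and a Benjamini-Hochberg adjustment. The function names, sample sizes, and simulated effect are all hypothetical.

```python
import numpy as np
from scipy import stats


def levene_per_level(group_a, group_b):
    """Median-centered Levene p-value for each row (one row per gene or voxel)."""
    return np.array([
        stats.levene(a, b, center="median").pvalue
        for a, b in zip(group_a, group_b)
    ])


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


# Hypothetical setting: many levels, few samples per group.
rng = np.random.default_rng(0)
n_levels, n_per_group = 1000, 6
group_a = rng.normal(0.0, 1.0, size=(n_levels, n_per_group))

# Assumed effect: the first 100 levels have a threefold larger standard deviation.
sd_b = np.ones(n_levels)
sd_b[:100] = 3.0
group_b = rng.normal(0.0, sd_b[:, None], size=(n_levels, n_per_group))

pvals = levene_per_level(group_a, group_b)
adjusted = benjamini_hochberg(pvals)
n_flagged = int(np.sum(adjusted < 0.05))
```

Each level is tested in isolation here, so no information is shared across levels. That separation is the limitation the abstract says the proposed method addresses when the number of tests is much larger than the sample sizes.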
