Optimal Shrinkage Estimation of Variances With Applications to Microarray Data Analysis

Microarray technology allows a scientist to study genome-wide patterns of gene expression. Thousands of individual genes are measured with a relatively small number of replications, which poses challenges to traditional statistical methods. In particular, the gene-specific estimators of variances are not reliable and gene-by-gene tests have low power. In this article we propose a family of shrinkage estimators for variances raised to a fixed power. We derive optimal shrinkage parameters under both Stein and squared loss functions. Our results show that the standard sample variance is inadmissible under either loss function. We propose several estimators for the optimal shrinkage parameters and investigate their asymptotic properties under two scenarios: a large number of replications and a large number of genes. We conduct simulations to evaluate the finite-sample performance of the data-driven optimal shrinkage estimators and compare them with some existing methods. We construct F-like statistics using these shrinkage variance estimators and apply them to detect differentially expressed genes in a microarray experiment. We also conduct simulations to evaluate the performance of these F-like statistics and compare them with some existing methods.
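
To make the idea of shrinking gene-specific variances concrete, the following is a minimal sketch and not the estimator derived in the article. It assumes a geometric-type shrinkage in which each sample variance is pulled toward the geometric mean of all sample variances by a user-chosen weight `alpha`. The function names, the simulation settings, and the fixed `alpha` are illustrative assumptions; the bias-correction constants and the optimal, data-driven shrinkage parameters discussed above are not reproduced here.

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance of one gene's replicate measurements."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def shrink_variances(variances, alpha):
    """Shrink each variance toward the geometric mean of all variances.

    alpha = 0 returns the inputs unchanged; alpha = 1 replaces every
    variance by the pooled (geometric mean) value. The weight alpha is
    supplied by the caller here (an assumption of this sketch).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    log_pooled = sum(math.log(v) for v in variances) / len(variances)
    return [
        math.exp(alpha * log_pooled + (1.0 - alpha) * math.log(v))
        for v in variances
    ]


def mean_squared_error(estimates, truths):
    """Average squared difference between estimates and true values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths)


def simulate(num_genes=1000, num_reps=4, seed=0):
    """Simulate normal data with gene-specific variances (hypothetical setup)."""
    rng = random.Random(seed)
    true_vars = [rng.uniform(0.5, 2.0) for _ in range(num_genes)]
    sample_vars = [
        sample_variance([rng.gauss(0.0, math.sqrt(tv)) for _ in range(num_reps)])
        for tv in true_vars
    ]
    return true_vars, sample_vars


if __name__ == "__main__":
    true_vars, sample_vars = simulate()
    for alpha in (0.0, 0.5, 1.0):
        shrunk = shrink_variances(sample_vars, alpha)
        print(alpha, mean_squared_error(shrunk, true_vars))
```

Running the script prints the squared-error loss for three fixed weights, which gives a rough feel for how much pooling information across genes can help when the number of replications is small. Choosing the weight from the data is the part addressed by the estimators proposed in the article.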
