Inference on the Limiting False Discovery Rate and the P-value Threshold Parameter Assuming Weak Dependence between Gene Expression Levels within Subject

An objective of microarray data analysis is to identify genes whose expression is associated with a disease-related outcome. For each gene, a test statistic is computed to determine whether an association exists, and this statistic generates a marginal p-value. To pool this information across genes, a p-value density function is derived. The p-value density is modeled as a mixture of a uniform (0,1) density and a scaled ratio of normal densities derived from the asymptotic normality of the test statistic. The p-values are assumed to be weakly dependent, and a quasi-likelihood is used to estimate the parameters of the mixture density. Together, the quasi-likelihood and the weak dependence assumption enable estimation and asymptotic inference on the false discovery rate for a given rejection region and on its inverse, the p-value threshold parameter for a fixed false discovery rate. A false discovery rate analysis of a localized prostate cancer data set illustrates the methodology, and simulations assess its performance.
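
The relationship between the mixture density, the false discovery rate for a rejection region [0, t], and the p-value threshold for a fixed false discovery rate can be sketched in a few lines. The abstract does not give the exact parameterization, so the sketch below assumes a one-sided z-test whose statistic has mean `mu` and unit variance under the alternative, with null proportion `pi0`; both names, and all function names, are hypothetical. The parameters are taken as known here, whereas the paper estimates them by quasi-likelihood and provides asymptotic inference, which this sketch does not attempt.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def pvalue_density(p, pi0, mu):
    """Assumed mixture density of a one-sided p-value.

    f(p) = pi0 * 1 + (1 - pi0) * phi(z - mu) / phi(z), with z = Phi^{-1}(1 - p).
    The ratio of normal densities simplifies to exp(mu * z - mu^2 / 2).
    """
    z = _STD.inv_cdf(1.0 - p)
    return pi0 + (1.0 - pi0) * math.exp(mu * z - 0.5 * mu * mu)


def pvalue_cdf(t, pi0, mu):
    """P(p-value <= t) under the assumed mixture."""
    z = _STD.inv_cdf(1.0 - t)
    # Under the alternative, P(Z > z) = Phi(mu - z) for Z ~ N(mu, 1).
    return pi0 * t + (1.0 - pi0) * _STD.cdf(mu - z)


def limiting_fdr(t, pi0, mu):
    """False discovery rate for the rejection region [0, t]."""
    return pi0 * t / pvalue_cdf(t, pi0, mu)


def threshold_for_fdr(q, pi0, mu, lo=1e-12, hi=1.0 - 1e-12, iters=200):
    """Invert limiting_fdr by bisection: find t with FDR(t) = q.

    For mu > 0 the FDR increases in t from 0 towards pi0, so a root
    exists only when 0 < q < pi0.
    """
    if not (0.0 < q < pi0):
        raise ValueError("q must lie strictly between 0 and pi0")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if limiting_fdr(mid, pi0, mu) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t = threshold_for_fdr(0.05, pi0=0.9, mu=2.0)
    print("threshold:", t, "FDR at threshold:", limiting_fdr(t, 0.9, 2.0))
```

Bisection is used only because the assumed false discovery rate is monotone in t when `mu` is positive; any one-dimensional root finder would serve equally well.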
