Robust estimation of the false discovery rate

MOTIVATION: Presently available methods that use p-values to estimate or control the false discovery rate (FDR) implicitly assume that the p-values are continuously distributed and based on two-sided tests. It is therefore difficult to estimate the FDR reliably when p-values are discrete or based on one-sided tests.

RESULTS: A simple and robust method to estimate the FDR is proposed. The method does not rely on implicit assumptions that tests are two-sided or that they yield continuously distributed p-values. It is proven to be conservative and to have desirable large-sample properties. In addition, it was among the best performers in a series of 'real data simulations' comparing the performance of five currently available methods.

AVAILABILITY: Libraries of S-plus and R routines to implement the method are freely available from www.stjuderesearch.org/depts/biostats.
