A simple procedure for estimating the false discovery rate

MOTIVATION: The false discovery rate (FDR) is now the most widely used criterion in microarray data analysis. When the estimation procedure relies only on the marginal distribution of the P-values and makes no assumption about gene expression changes, estimators of the FDR are necessarily conservatively biased: the key quantity pi0, the probability that a gene is unmodified, can only be estimated by an upper bound. In this paper, we propose a novel family of estimators for pi0 from which the FDR can be calculated.

RESULTS: We present a very simple method for estimating pi0, called LBE (Location Based Estimator), together with results on its variability. Simulation results indicate that the proposed estimator performs well in finite samples and, in most cases, has the lowest mean square error compared with the procedures QVALUE, BUM and SPLOSH. The different procedures are then applied to real datasets.

AVAILABILITY: The R function LBE is available at http://ifr69.vjf.inserm.fr/lbe

CONTACT: broet@vjf.inserm.fr
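To make the idea of a location-based estimate of pi0 concrete, here is a minimal Python sketch. It is not the authors' R function LBE. The abstract does not give the estimator's formula, so the transformation phi(p) = [-ln(1 - p)]^a, the parameter name `a`, and both function names are assumptions chosen for illustration. The sketch relies on one fact: if P is uniform on [0, 1], then -ln(1 - P) follows a unit exponential distribution, whose a-th moment is Gamma(a + 1). Dividing the sample mean of phi(P) by that null mean gives a value that tends to overestimate pi0 when modified genes have small P-values, which matches the upper-bound behaviour described above. The second function plugs the estimate into the usual ratio pi0 * m * t / R(t) for a rejection threshold t, again as an assumed, generic form.

```python
import math


def estimate_pi0(pvalues, a=1.0):
    """Location-based sketch of a pi0 estimate (hypothetical helper).

    Assumption: phi(p) = (-ln(1 - p)) ** a. Under the null, P is uniform,
    so -ln(1 - P) is unit exponential and E[phi(P)] = Gamma(a + 1).
    """
    if not pvalues:
        raise ValueError("pvalues must not be empty")
    eps = 1e-12  # keeps log1p finite when a P-value equals exactly 1
    total = 0.0
    for p in pvalues:
        p = min(max(p, 0.0), 1.0 - eps)
        total += (-math.log1p(-p)) ** a
    mean_phi = total / len(pvalues)
    null_mean = math.gamma(a + 1.0)
    # A proportion cannot exceed 1, so the ratio is truncated.
    return min(1.0, mean_phi / null_mean)


def estimate_fdr(pvalues, threshold, pi0):
    """Plug-in FDR estimate at a threshold: pi0 * m * t / R(t) (hypothetical helper)."""
    m = len(pvalues)
    rejected = sum(1 for p in pvalues if p <= threshold)
    # max(..., 1) avoids division by zero when nothing is rejected.
    return min(1.0, pi0 * m * threshold / max(rejected, 1))


if __name__ == "__main__":
    # Synthetic illustration only: 500 "modified" genes with small P-values
    # and 500 "unmodified" genes spread evenly over (0, 1).
    modified = [0.001] * 500
    unmodified = [(i + 0.5) / 500 for i in range(500)]
    pvals = modified + unmodified
    pi0_hat = estimate_pi0(pvals)
    print("estimated pi0:", pi0_hat)
    print("estimated FDR at t = 0.05:", estimate_fdr(pvals, 0.05, pi0_hat))
```

The synthetic P-values in the demo are made up for illustration and are not taken from the paper's simulations or datasets. For real analyses, the authors' R function at the URL above is the reference implementation.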
