Summary and discussion of: “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing”

In hypothesis testing, the multiplicity problem occurs when a large number of hypothesis tests are performed simultaneously. With moderately sized data sets it might be possible to gloss over this issue, yet in an era increasingly characterized by massive data sets, this is no longer possible. In genetics, DNA microarray experiments are used to gain a better understanding of the causes and effects of diseases by investigating changes in gene expression for thousands of genes. In such a microarray experiment, one might easily perform 10,000 tests. Using a standard significance level of 0.05, one would expect 500 genes to be deemed significant by chance even if there were no effect at all. To correct hypothesis testing procedures under these circumstances, several methods have been proposed, some based on the concept of the False Discovery Rate (FDR), which forms the main object of interest for the rest of this summary paper. We start our discussion by introducing the necessary notation and briefly reviewing the classical approaches to the multiplicity problem. Then, the definition of the FDR is introduced and the original method outlined in [1] is explained. Finally, simulation results are presented and points brought up in the in-class discussion are summarized.
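The back-of-the-envelope calculation above (10,000 tests at level 0.05 yielding roughly 500 false positives under the global null) can be checked with a small simulation. The sketch below, assuming a uniform distribution of p-values under the null, simply draws 10,000 uniform p-values and counts how many fall below 0.05; the variable names and the fixed seed are illustrative choices, not part of the original paper.

```python
import random

random.seed(0)

m = 10_000   # number of simultaneous hypothesis tests, all truly null
alpha = 0.05  # per-test significance level

# Under the null hypothesis, each p-value is uniform on (0, 1).
p_values = [random.random() for _ in range(m)]

# A "false discovery" here is any null test with p-value below alpha.
false_positives = sum(p < alpha for p in p_values)

print(f"expected about {int(m * alpha)} false positives, observed {false_positives}")
```

On any run, the observed count will hover around 500, illustrating why an uncorrected per-test threshold is untenable at this scale.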