A Comparison of Procedures for Controlling the False Discovery Rate in the Presence of Small Variance Genes: A Simulation Study

The Significance Analysis of Microarrays (SAM; Tusher et al., 2001) method is widely used to analyze gene expression data; it controls the false discovery rate (FDR) through a resampling-based procedure. A main component of SAM is an adjustment of the test statistic: a fudge factor is added to the denominator to deflate test statistics that are large only because the standard error of a gene's expression is small. Lin et al. (2008) pointed out that, in the presence of small-variance genes, the fudge factor does not effectively improve the power or the control of the FDR compared with the SAM procedure without the fudge factor. Motivated by the simulation results presented in Lin et al. (2008), in this article we extend our study to compare several methods for choosing the fudge factor in the modified t-type test statistics, and we use simulation studies to investigate the power and the FDR control of the considered methods.
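
To make the role of the fudge factor concrete, the following is a minimal sketch of a two-group modified t-type statistic of the form d = (mean1 - mean2) / (s + s0), where s is the pooled standard error and s0 is the fudge factor. The function names, the expression values, and the choice s0 = 0.5 are assumptions made up for illustration; they are not taken from the article, and the sketch does not implement any of the methods for choosing s0 that the article compares.

```python
import math
from statistics import mean


def pooled_standard_error(x, y):
    """Pooled standard error of the difference in means of two groups."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    # Sum of squared deviations from each group's own mean.
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    return math.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))


def modified_t(x, y, s0=0.0):
    """Modified t-type statistic; s0 = 0 gives the ordinary pooled t."""
    return (mean(x) - mean(y)) / (pooled_standard_error(x, y) + s0)


# Hypothetical expression values (illustrative only).
# A gene with a tiny effect but very small variance:
small_var_gene = ([5.00, 5.01, 4.99], [5.10, 5.11, 5.09])
# A gene with a large effect and moderate variance:
large_effect_gene = ([5.0, 6.0, 4.0], [8.0, 9.0, 7.0])

for s0 in (0.0, 0.5):  # 0.5 is an arbitrary illustrative fudge factor
    d_small = modified_t(*small_var_gene, s0=s0)
    d_large = modified_t(*large_effect_gene, s0=s0)
    print(f"s0={s0}: small-variance gene d={d_small:.3f}, "
          f"large-effect gene d={d_large:.3f}")
```

With s0 = 0 the small-variance gene has the larger absolute statistic even though its mean difference is only 0.1; with the illustrative s0 = 0.5 the ranking reverses. How s0 should be chosen is exactly the question the article studies.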

[1] Jean-Jacques Daudin, et al. Mixture model on the variance for the differential analysis of gene expression data, 2005.

[2] Robert Tibshirani, et al. SAM “Significance Analysis of Microarrays” Users guide and technical document, 2002.

[3] P. Broberg. Statistical methods for ranking differentially expressed genes, 2003, Genome Biology.

[4] Mark S. Gilthorpe, et al. A full Bayesian hierarchical mixture model for the variance of gene differential expression, 2007, BMC Bioinformatics.

[5] John D. Storey, et al. Empirical Bayes Analysis of a Microarray Experiment, 2001.

[6] John D. Storey. A direct approach to false discovery rates, 2002.

[7] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[8] R. Tibshirani, et al. Empirical bayes methods and false discovery rates for microarrays, 2002, Genetic epidemiology.

[9] Scott L. Zeger, et al. The Analysis of Gene Expression Data: An Overview of Methods and Software, 2003.

[10] R. Tibshirani, et al. Least angle regression, 2004, math/0406456.

[11] Hinrich W. H. Göhlmann, et al. An Investigation on Performance of Significance Analysis of Microarray (SAM) for the Comparisons of Several Treatments with one Control in the Presence of Small‐variance Genes, 2008, Biometrical journal. Biometrische Zeitschrift.

[12] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[13] R. Tibshirani, et al. Significance analysis of microarrays applied to the ionizing radiation response, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[14] G. Parmigiani, et al. The Analysis of Gene Expression Data, 2003.

[15] Terence P. Speed, et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, 2003, Bioinform.

[16] Thomas D. Wu, et al. Analysing gene expression data from DNA microarrays to identify candidate genes, 2001, The Journal of pathology.

[17] M. Niranjan, et al. Ranking the Effect of Different Features on the Classification of Discrete Valued Data, 2007.