Sample size calculation for a regularized t-statistic in microarray experiments

The regularized t-statistic, in which a regularization term is added to the denominator to improve the identification of differentially expressed genes, outperforms the ordinary t-statistic. Most sample size calculation methods in the literature, however, are based on the ordinary t-statistic. We derive an approximate formula for the distribution of the regularized t-statistic and use it to develop a sample size formula for that statistic. Using a mixture model, the sample size is determined based on sensitivity, under conditions that maintain a specified false discovery rate. The usefulness of the proposed method is demonstrated by numerical studies that compare the sample sizes required by the regularized and ordinary t-statistics, and by simulation studies based on real data that examine the robustness of the proposed method.
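
As a rough illustration of adding a regularization term to the denominator, the sketch below computes a two-sample statistic for a single gene. It assumes an additive constant `s0` and a pooled-variance standard error, as in SAM-style statistics; the abstract does not specify the exact form, so the function name `regularized_t` and the parameter `s0` are hypothetical, and this is not the paper's sample size formula.

```python
import math
from statistics import mean, variance


def regularized_t(x, y, s0=0.0):
    """Two-sample t-type statistic with a constant s0 added to the denominator.

    Assumption: additive regularization of a pooled-variance standard error.
    Setting s0 = 0 recovers the ordinary pooled-variance t-statistic.
    """
    n1, n2 = len(x), len(y)
    # Pooled variance estimate across the two groups.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    # Standard error of the difference in group means.
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(x) - mean(y)) / (se + s0)


# Hypothetical expression values for one gene in two groups.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
t_ordinary = regularized_t(group_a, group_b)          # s0 = 0
t_regularized = regularized_t(group_a, group_b, 0.5)  # s0 = 0.5
```

With a positive `s0`, the statistic shrinks toward zero, and the shrinkage is strongest for genes whose estimated standard error is small. This damps large statistics that arise only from an underestimated variance.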
