Properties of Balanced Permutations

This paper examines balanced permutations, a recently developed sample reuse method with applications in bioinformatics. Balanced permutation reference distributions do not have the correct null behavior, a failure that can be traced to their lack of a group structure. We find that they can give p-values that are too permissive, to varying degrees. In particular, the observed test statistic can exceed the statistics from all B balanced permutations of a data set with probability much higher than 1/(B + 1), even under the null hypothesis.
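The last claim can be checked with a small simulation. The sketch below is illustrative only and is not taken from the paper. It assumes two groups of n = 4 observations, i.i.d. standard normal data under the null, the absolute difference in group means as the test statistic, and full enumeration of the balanced relabelings (each new group takes n/2 observations from each original group, with mirror-image relabelings counted once). The function names `balanced_contrasts` and `prob_observed_beats_all` are hypothetical.

```python
from itertools import combinations

import numpy as np


def balanced_contrasts(n):
    """Return +1/-1 label vectors for balanced relabelings of two groups of size n.

    Each row puts n/2 members of each original group into the +1 group.
    Index 0 is pinned to the +1 group so that a relabeling and its mirror
    image (which give the same absolute statistic) are counted once.
    """
    half = n // 2
    rows = []
    for keep_x in combinations(range(1, n), half - 1):
        for take_y in combinations(range(n, 2 * n), half):
            v = -np.ones(2 * n)
            v[[0, *keep_x, *take_y]] = 1.0
            rows.append(v)
    return np.array(rows)


def prob_observed_beats_all(n=4, n_sim=20000, seed=0):
    """Estimate P(observed statistic > every balanced-permutation statistic)."""
    rng = np.random.default_rng(seed)
    contrasts = balanced_contrasts(n)            # shape (B, 2n)
    observed = np.r_[np.ones(n), -np.ones(n)]    # original labeling
    data = rng.standard_normal((n_sim, 2 * n))   # null: all 2n values i.i.d.
    obs_stat = np.abs(data @ observed) / n       # |mean difference|, observed
    perm_stat = np.abs(data @ contrasts.T) / n   # same statistic, relabeled
    wins = obs_stat > perm_stat.max(axis=1)
    return wins.mean(), contrasts.shape[0]


estimate, B = prob_observed_beats_all()
nominal = 1.0 / (B + 1)
print(f"B = {B}, nominal 1/(B+1) = {nominal:.4f}, simulated = {estimate:.4f}")
```

Under these assumptions every balanced label vector is orthogonal to the original labeling, so the original labeling never appears among its own balanced permutations. This is one way to see the missing group structure, and the simulated frequency can be compared directly with the nominal 1/(B + 1).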
