In current practice, for example in genome-wide association studies (GWAS), permutation is often applied to multiple testing for association between a large number of features [e.g. single nucleotide polymorphisms (SNPs)] and phenotypes (Hahn et al., 2008). Inferring merely that the phenotypic groups X and Y differ in some of the features is not very useful; one has to know for which features there is a difference. Exchangeability, a necessary condition for the validity of permutation tests, might be applicable if subjects are assigned randomly to treatments and the treatment is totally innocuous. However, bioinformatics discovery studies are mostly retrospective rather than randomized, controlled clinical trials. Huang et al. (2006) give examples of how permutation testing may fail to control the Type I error rate when exchangeability does not hold. Equally important, Theorem 2.2 of that paper gives a succinct condition under which permutation testing is valid even when exchangeability fails. The condition is as follows: in testing the null hypothesis that there is no difference in an entire set of features between groups X and Y, when the sample sizes are equal, permutation testing controls the Type I error rate even if the data distributions F_X and F_Y have unequal even-order cumulants, so long as they have equal odd-order cumulants of third order and higher. This precise condition is the basis on which the subsequent papers Xu and Hsu (2007) and Calian et al. (2008) uncover the Marginal-Determines-the-Joint (MDJ) distribution condition needed for permutation multiple testing to control multiple testing error rates. Regardless of sample sizes, permutation multiple tests may not control false discoveries of which features are predictive of phenotype, unless it is assumed that the joint distributions of nonpredictive features are identical between the X and Y groups. Checking this assumption on the joint distribution using the data
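The role of equal sample sizes in the single-test condition above can be illustrated with a small simulation. The following Python sketch is an illustration only and is not the analysis of Huang et al. (2006): it assumes normally distributed data (so all odd cumulants of third order and higher are zero), equal group means, unequal variances, and a two-sided difference-in-means statistic. The function names `perm_pvalue` and `type1_error_rate`, the sample sizes and the standard deviations are all hypothetical choices made for this example.

```python
import numpy as np


def perm_pvalue(x, y, n_perm, rng):
    """Two-sided permutation p-value for a difference in group means."""
    pooled = np.concatenate([x, y])
    n_x = len(x)
    observed = abs(x.mean() - y.mean())
    # Each row of idx is an independent random permutation of the pooled indices.
    idx = np.argsort(rng.random((n_perm, pooled.size)), axis=1)
    shuffled = pooled[idx]
    stats = np.abs(shuffled[:, :n_x].mean(axis=1) - shuffled[:, n_x:].mean(axis=1))
    # Add-one correction so the p-value is never exactly zero.
    return (1 + np.sum(stats >= observed)) / (n_perm + 1)


def type1_error_rate(n_x, n_y, sd_x, sd_y, n_sim=400, n_perm=199, alpha=0.05, seed=0):
    """Estimate how often the test rejects when both group means are zero."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        # Assumed setting: equal means, unequal variances, normal data.
        x = rng.normal(0.0, sd_x, n_x)
        y = rng.normal(0.0, sd_y, n_y)
        if perm_pvalue(x, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Equal sample sizes, unequal variances.
    print("equal n:  ", type1_error_rate(15, 15, sd_x=3.0, sd_y=1.0))
    # Unequal sample sizes, with the smaller group having the larger variance.
    print("unequal n:", type1_error_rate(5, 25, sd_x=3.0, sd_y=1.0))
```

Under these assumptions the estimated rejection rate with equal sample sizes should stay close to the nominal 5% level, whereas with unequal sample sizes it is expected to be noticeably inflated. The sketch covers only a single global test; it says nothing about which features differ, which is the multiple testing question the MDJ condition addresses.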
[1] Jin Xu, et al. Robustified MANOVA with applications in detecting differentially expressed genes from oligonucleotide arrays. Bioinform., 2008.
[2] Kathleen F. Kerr, et al. Comments on the analysis of unbalanced microarray data. Bioinform., 2009.
[3] B. A. English, et al. Multivariate permutation analysis associates multiple polymorphisms with subphenotypes of major depression. Genes, brain, and behavior, 2008.
[4] James F. Troendle, et al. Multiple Testing with Minimal Assumptions. Biometrical journal. Biometrische Zeitschrift, 2008.
[5] Yifan Huang, et al. To permute or not to permute. Bioinform., 2006.
[6] Violeta Calian, et al. Partitioning to Uncover Conditions for Permutation Tests to Control Multiple Testing Error Rates. Biometrical journal. Biometrische Zeitschrift, 2008.
[7] J. Hsu, et al. Applying the Generalized Partitioning Principle to Control the Generalized Familywise Error Rate. Biometrical journal. Biometrische Zeitschrift, 2007.
[8] Xin Gao, et al. Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments. Bioinform., 2006.
[9] F. Pesarin, et al. Testing Marginal Homogeneity Against Stochastic Order in Multivariate Ordinal Data. Biometrics, 2009.