A Comparison of Parametric Versus Permutation Methods with Applications to General and Temporal Microarray Gene Expression Data

MOTIVATION: In analyses of microarray data from designs with several biological conditions, it is often desirable to rank genes by their differential 'importance' so that biologists can focus their research on the small subset of genes most likely to be related to the experimental conditions. Permutation methods are often recommended and used in place of their parametric counterparts because microarray experiments have small sample sizes and the data may be non-normal. These recommendations, however, rest on classical knowledge from the hypothesis-testing setting.

RESULTS: We explore the relationship between hypothesis testing and gene ranking. We point out that the permutation method does not provide a metric for the distance between two underlying distributions. In our simulation studies, permutation methods tend to be equally or less accurate than parametric methods in ranking genes. This is partly due to the discreteness of the permutation distributions and partly due to the non-metric property. In data analysis, the variability in gene ranking can be assessed by the bootstrap. The variability turns out to be much lower for permutation than for parametric methods, which agrees with the known robustness of permutation methods to individual outliers in the data.
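
The discreteness mentioned above can be illustrated with a minimal sketch. It is not the simulation design used in the paper: the two genes, their expression values, and the helper names `permutation_p_value` and `pooled_t_statistic` are hypothetical and chosen only for illustration. With three arrays per condition there are only 20 possible relabelings, so an exact two-sided permutation p-value cannot fall below 2/20, and genes with very different degrees of separation can receive the same p-value and therefore tie in a ranking.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(x) + list(y)
    indices = range(len(pooled))
    observed = abs(mean(x) - mean(y))
    hits = 0
    total = 0
    # Enumerate every way of assigning len(x) of the pooled values to group x.
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        if abs(mean(group_x) - mean(group_y)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


def pooled_t_statistic(x, y):
    """Classical equal-variance two-sample t statistic (absolute value)."""
    n_x, n_y = len(x), len(y)
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    return abs(mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))


# Hypothetical expression values: 3 arrays per condition for each gene.
gene_a = ([1.0, 1.1, 1.2], [2.0, 2.1, 2.2])  # modest separation
gene_b = ([1.0, 1.1, 1.2], [9.0, 9.1, 9.2])  # much larger separation

p_a = permutation_p_value(*gene_a)
p_b = permutation_p_value(*gene_b)
t_a = pooled_t_statistic(*gene_a)
t_b = pooled_t_statistic(*gene_b)

print(f"gene A: permutation p = {p_a:.3f}, |t| = {t_a:.2f}")
print(f"gene B: permutation p = {p_b:.3f}, |t| = {t_b:.2f}")
```

In this toy case both genes attain the smallest achievable permutation p-value and so cannot be ordered by it, whereas the parametric statistic separates them. Whether that ordering is the more accurate one depends on the distributional assumptions holding, which is the question the simulation studies address.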
