A global test for groups of genes: testing association with a clinical outcome

MOTIVATION: This paper presents a global test for the analysis of microarray data. The test determines whether the global expression pattern of a group of genes is significantly related to a clinical outcome of interest. Groups of genes (e.g. known pathways, specific areas of the genome or clusters from a cluster analysis) may be of any size, from a single gene to all genes on the chip.

RESULTS: Because the test gives one p-value for the group rather than a p-value for each gene, groups of genes of different sizes can be compared. Researchers can use the test to investigate hypotheses based on theory or past research, or to mine gene ontology databases for interesting pathways. Multiple testing problems do not occur unless many groups are tested. Special attention is given to visualizations of the test result, focussing on the associations between samples and showing the impact of individual genes on the test result.

AVAILABILITY: An R package, globaltest, is available from http://www.bioconductor.org
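
As a rough illustration of how one p-value can summarize a whole group of genes, the Python sketch below computes a quadratic-form statistic from the sample-by-sample similarity of the group's expression profiles and calibrates it by permuting the outcome. It is not the globaltest implementation. The function name `global_test_permutation`, the intercept-only null model, the choice of statistic and the permutation p-value are all assumptions made for illustration.

```python
import numpy as np


def global_test_permutation(X, y, n_perm=1000, seed=0):
    """Hypothetical group-level association test (illustrative only).

    X : (n_samples, n_genes) expression matrix for ONE group of genes.
    y : (n_samples,) clinical outcome (e.g. 0/1 labels or a continuous value).
    Returns the observed statistic and a single permutation p-value.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Centre each gene, then build a sample-by-sample similarity matrix
    # averaged over the genes in the group.
    Xc = X - X.mean(axis=0)
    R = Xc @ Xc.T / X.shape[1]

    # Residuals of the outcome under an intercept-only null model (assumption).
    r = y - y.mean()

    def statistic(res):
        # Large when samples with similar expression have similar outcomes.
        return float(res @ R @ res)

    q_obs = statistic(r)
    q_perm = np.array([statistic(rng.permutation(r)) for _ in range(n_perm)])

    # One p-value for the whole group, regardless of how many genes it holds.
    p_value = (1 + np.sum(q_perm >= q_obs)) / (n_perm + 1)
    return q_obs, float(p_value)


if __name__ == "__main__":
    # Simulated data: 40 samples, 20 genes, half of the genes shifted by outcome.
    rng = np.random.default_rng(1)
    y = np.repeat([0.0, 1.0], 20)
    X = rng.normal(size=(40, 20))
    X[:, :10] += 2.0 * y[:, None]
    q, p = global_test_permutation(X, y)
    print(f"statistic = {q:.2f}, p-value = {p:.4f}")
```

Because the statistic is averaged over the genes in the group, the same function can be applied to a single gene or to every gene on the chip, and each call yields exactly one p-value.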
