Testing for mean and correlation changes in microarray experiments: an application for pathway analysis

Background: Microarray experiments measure changes in the transcript levels of tens of thousands of genes simultaneously. To derive meaningful results, biologists investigate the response of genes within specific pathways. Pathways comprise genes that interact to carry out a particular biological function. Existing methods for analyzing pathways focus on detecting changes in the mean, or over-representation of differentially expressed genes relative to the total number of genes in the pathway. How to incorporate the influence of correlation among the genes is not generally addressed.

Results: In this paper, we propose a non-parametric rank test for analyzing pathways that takes into account the correlation among the genes, and we compare it with two existing methods, Global and Gene Set Enrichment Analysis (GSEA), using two publicly available data sets. A simulation study was conducted to demonstrate the advantage of the rank test method.

Conclusions: The data indicate the advantages of the rank test. The method can distinguish significant changes in pathways due to changes in correlation, changes in the mean, or both. In the simulation study, the rank test outperformed Global and GSEA. The greatest gain in performance was in the small sample size case, which makes the rank test ideal for microarray experiments.
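To make the idea of testing a pathway for mean and correlation changes concrete, the following is a minimal sketch of a rank-based permutation test. It is not the statistic proposed in the paper, whose details are not given in the abstract. The function names (`rank_rows`, `pathway_statistic`, `permutation_pvalues`), the choice of squared differences in mean ranks and in rank-correlation matrices, and the two-group 0/1 labelling are all assumptions made for illustration.

```python
import numpy as np


def rank_rows(x):
    """Rank each gene (row) across all samples, 1..n. Ties are not handled."""
    return np.argsort(np.argsort(x, axis=1), axis=1) + 1.0


def pathway_statistic(ranks, labels):
    """Illustrative (assumed) statistics for one pathway.

    Returns a pair: the sum of squared differences in per-gene mean rank
    between the two groups, and the sum of squared differences between the
    two groups' gene-by-gene rank-correlation matrices.
    """
    a = ranks[:, labels == 0]
    b = ranks[:, labels == 1]
    mean_part = np.sum((a.mean(axis=1) - b.mean(axis=1)) ** 2)
    corr_part = np.sum((np.corrcoef(a) - np.corrcoef(b)) ** 2)
    return mean_part, corr_part


def permutation_pvalues(x, labels, n_perm=500, seed=0):
    """Permutation p-values for the mean and correlation components.

    x      : genes-by-samples expression matrix for one pathway
    labels : array of 0/1 group labels, one per sample
    """
    rng = np.random.default_rng(seed)
    ranks = rank_rows(x)
    observed = np.array(pathway_statistic(ranks, labels))
    exceed = np.zeros(2)
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)  # reassign samples to groups
        exceed += np.array(pathway_statistic(ranks, shuffled)) >= observed
    # Add-one correction keeps the p-values strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Calling `permutation_pvalues(x, labels)` on a genes-by-samples matrix returns two p-values, one per component. Keeping the components separate is a design choice of this sketch: it shows whether a pathway-level signal comes from a shift in expression, a change in co-expression, or both. Because the ranks are pooled across groups, a strong mean shift can also influence the correlation component's permutation distribution, so the two p-values should not be read as fully independent.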
