ADGO: analysis of differentially expressed gene sets using composite GO annotation

MOTIVATION Genes are typically expressed in modular manners in biological processes. Recent studies reflect such features in analyzing gene expression patterns by directly scoring gene sets. Gene annotations have been used to define the gene sets, which have served to reveal specific biological themes from expression data. However, current annotations have limited analytical power, because they are classified by single categories providing only unary information for the gene sets. RESULTS Here we propose a method for discovering composite biological themes from expression data. We intersected two annotated gene sets from different categories of Gene Ontology (GO). We then scored the expression changes of all the single and intersected sets. In this way, we were able to uncover, for example, a gene set with the molecular function F and the cellular component C that showed significant expression change, while the changes in individual gene sets were not significant. We provided an exemplary analysis for HIV-1 immune response. In addition, we tested the method on 20 public datasets where we found many 'filtered' composite terms the number of which reached approximately 34% (a strong criterion, 5% significance) of the number of significant unary terms on average. By using composite annotation, we can derive new and improved information about disease and biological processes from expression data. AVAILABILITY We provide a web application (ADGO: http://array.kobic.re.kr/ADGO) for the analysis of differentially expressed gene sets with composite GO annotations. The user can analyze Affymetrix and dual channel array (spotted cDNA and spotted oligo microarray) data for four species: human, mouse, rat and yeast. CONTACT chu@kribb.re.kr SUPPLEMENTARY INFORMATION http://array.kobic.re.kr/ADGO.

[1]  William Stafford Noble,et al.  Exploring Gene Expression Data with Class Scores , 2001, Pacific Symposium on Biocomputing.

[2]  Wei Pan,et al.  A mixture model approach to detecting differentially expressed genes with microarray data , 2003, Functional & Integrative Genomics.

[3]  Richard A Lempicki,et al.  HIV envelope induces a cascade of cell signals in non-proliferating target cells that favor virus replication , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[4]  Jelle J. Goeman,et al.  A global test for groups of genes: testing association with a clinical outcome , 2004, Bioinform..

[5]  J. Thomas,et al.  An efficient and robust statistical modeling approach to discover differentially expressed genes using genomic expression profiles. , 2001, Genome research.

[6]  Pablo Tamayo,et al.  Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[7]  John D. Storey,et al.  Empirical Bayes Analysis of a Microarray Experiment , 2001 .

[8]  Paul Pavlidis,et al.  ErmineJ: Tool for functional analysis of gene expression data sets , 2005, BMC Bioinformatics.

[9]  Joaquín Dopazo,et al.  Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information , 2005, Bioinform..

[10]  Seon-Young Kim,et al.  PAGE: Parametric Analysis of Gene Set Enrichment , 2005, BMC Bioinform..

[11]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[12]  Mingzhu Zhu,et al.  MEGO: gene functional module expression based on gene ontology. , 2005, BioTechniques.

[13]  Dennis B. Troup,et al.  NCBI GEO: mining millions of expression profiles—database and tools , 2004, Nucleic Acids Res..

[14]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[15]  Daniel J. Vis,et al.  T-profiler: scoring the activity of predefined groups of genes using gene expression data , 2005, Nucleic Acids Res..

[16]  D. Koller,et al.  A module map showing conditional activity of expression modules in cancer , 2004, Nature Genetics.

[17]  P. Park,et al.  Discovering statistically significant pathways in expression profiling studies. , 2005, Proceedings of the National Academy of Sciences of the United States of America.