Complex functionality of gene groups identified from high-throughput data.

Relating experimental data to biological knowledge is necessary to cope with the avalanche of new data emerging from recent developments in high-throughput technologies. Automatic functional profiling has become the de facto standard approach for the secondary analysis of high-throughput data, and a number of tools that employ available gene functional annotations have been developed for this purpose. However, current annotations are derived mostly from traditional analysis of individual gene function. Complex biological phenomena carried out by the concerted activity of many genes often require the definition of a new, complex functionality (related to a group of genes), which in many cases is not available in current annotation vocabularies. Functional profiling with annotation terms that describe the individual biological functions of a gene may therefore fail to provide a reasonable interpretation of the biological relationships in a set of genes involved in complex biological phenomena. We introduce a novel procedure to profile the complex functionality of a gene set, where a complex functionality is constructed as a combination of available annotation terms. By profiling ChIP-chip data from Saccharomyces cerevisiae, we demonstrate that this technique produces deeper insights into the results of high-throughput experiments, beyond the known facts described in the functional classifications.
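
To make the idea of a combined annotation term concrete, the following is a minimal sketch and not the procedure of the paper. It assumes that a combination is the conjunction of two terms (a gene carries the combined term only if it carries both) and that enrichment is scored with a one-sided hypergeometric test; the abstract specifies neither. The function names, gene identifiers, and toy annotations are hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def profile_pairs(annotations, gene_set):
    """Score every pair of terms (assumed conjunction) against a gene set.

    annotations: dict mapping gene id -> set of annotation terms
    gene_set:    genes of interest; genes without annotations are ignored
    Returns (pair, overlap, carriers, p_value) tuples sorted by p-value.
    """
    genes = set(annotations)
    selected = set(gene_set) & genes
    terms = sorted(set().union(*annotations.values()))
    results = []
    for a, b in combinations(terms, 2):
        # Genes annotated with both terms form the combined category.
        carriers = {g for g, t in annotations.items() if a in t and b in t}
        overlap = len(carriers & selected)
        if overlap == 0:
            continue
        p = hypergeom_tail(overlap, len(carriers), len(selected), len(genes))
        results.append(((a, b), overlap, len(carriers), p))
    return sorted(results, key=lambda r: r[3])


# Hypothetical toy annotations, for illustration only.
annotations = {
    "g1": {"ribosome", "nucleus"},
    "g2": {"ribosome", "nucleus"},
    "g3": {"ribosome"},
    "g4": {"nucleus"},
    "g5": {"transport"},
    "g6": {"transport", "nucleus"},
}
gene_set = {"g1", "g2"}

results = profile_pairs(annotations, gene_set)
single_term_p = hypergeom_tail(2, 3, 2, 6)  # "ribosome" alone: 3 carriers
for pair, overlap, carriers, p in results:
    print(pair, overlap, carriers, round(p, 4))
```

In this toy data the pair of terms describes the gene set more specifically than the single term "ribosome" does, which is the kind of effect the abstract attributes to combined terms. A real analysis would also need a multiple-testing correction, since the number of combinations grows quickly with the size of the vocabulary.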
