Paper information - Generating GO Slim Using Relational Database Management Systems to Support Proteomics Analysis

The Gene Ontology Consortium built the Gene Ontology (GO) database to address the need for a common standard in naming genes and gene products. When the same concept goes by different names, and different concepts share the same name, it is effectively impossible for humans and computers alike to analyze biological processes across different organisms. The consortium addresses this need by defining terms for categorizing genes and gene products. By convention, each gene or gene product is annotated to the most specific applicable term in the GO database. When analyzing the results of an experiment, however, researchers also find it useful to group genes or gene products into broad biological categories that give a higher-level view of their function. A GO Slim is a subset of the GO ontology that provides such a higher-level view. Existing GO Slim generation tools have two important limitations: dependence on a specific programming language, and an inability to generate a GO Slim dynamically during analysis. We have extended the relational database engine to generate a GO Slim dynamically, overcoming these limitations. Using this extension, we have developed a tool (Dynamic GOSlim) that dynamically generates a GO Slim and uses it to categorize genes or gene products. This tool is being used in an ongoing proteomics project aimed at identifying possible oral cancer biomarkers in saliva.
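To make the idea of slim-based categorization inside a relational database concrete, the following is a minimal sketch, not the authors' implementation. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for the database engine, and it uses a recursive query to walk from each annotated term up to its ancestors, keeping only ancestors that belong to the chosen slim. The table names (`term_parent`, `annotation`, `slim`), the function `categorize`, and all term and gene labels are hypothetical and are not real GO data.

```python
import sqlite3

# Recursive query: start from every annotated term, follow child -> parent
# edges upward, then keep only the ancestors listed in the slim table.
QUERY = """
WITH RECURSIVE ancestor(term, anc) AS (
    SELECT term, term FROM annotation          -- a term is its own ancestor
    UNION
    SELECT a.term, e.parent
    FROM ancestor AS a
    JOIN term_parent AS e ON e.child = a.anc   -- climb one level
)
SELECT DISTINCT ann.gene, anc.anc AS slim_term
FROM annotation AS ann
JOIN ancestor AS anc ON anc.term = ann.term
JOIN slim AS s ON s.term = anc.anc
ORDER BY ann.gene, slim_term;
"""


def categorize(edges, annotations, slim_terms):
    """Map each gene to the slim terms that are ancestors of its annotations.

    edges:       iterable of (child_term, parent_term) pairs
    annotations: iterable of (gene, term) pairs
    slim_terms:  iterable of terms chosen as the slim (hypothetical input)
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term_parent (child TEXT, parent TEXT);
        CREATE TABLE annotation (gene TEXT, term TEXT);
        CREATE TABLE slim (term TEXT PRIMARY KEY);
        """
    )
    conn.executemany("INSERT INTO term_parent VALUES (?, ?)", edges)
    conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotations)
    conn.executemany("INSERT INTO slim VALUES (?)", [(t,) for t in slim_terms])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


# Toy ontology (invented labels). specific_b has two parents, as in a DAG.
EDGES = [
    ("specific_a", "broad_x"),
    ("specific_b", "broad_x"),
    ("specific_b", "broad_y"),
    ("specific_c", "broad_y"),
    ("broad_x", "root"),
    ("broad_y", "root"),
]
ANNOTATIONS = [
    ("gene1", "specific_a"),
    ("gene2", "specific_b"),
    ("gene3", "specific_c"),
]

if __name__ == "__main__":
    for gene, slim_term in categorize(EDGES, ANNOTATIONS, ["broad_x", "broad_y"]):
        print(gene, slim_term)
```

Because the slim is just another table in this sketch, a different slim can be supplied at query time without regenerating anything outside the database, which is the kind of dynamic behavior the abstract describes. If one slim term is an ancestor of another, a gene is reported under both.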

John V. Carlis | Hongwei Xie | Getiria Onsongo | Timothy J. Griffin
