Graph-based analysis and visualization of experimental results with ONDEX

MOTIVATION: Assembling the information needed to interpret the output of high-throughput, genome-scale experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant changes in expression levels, but more information is needed to determine their biological relevance. The challenge is to bring these genes together with biological information that is distributed across hundreds of databases or buried in the scientific literature (millions of articles). Software tools are needed to automate this task, which at present is labor-intensive and requires considerable informatics and biological expertise.

RESULTS: This article describes ONDEX and how it can be applied to the task of interpreting gene expression results. ONDEX is a database system that combines semantic database integration and text mining with methods for graph-based analysis. An overview of the ONDEX system is presented, concentrating on recently developed features for graph-based analysis and visualization. A case study shows how ONDEX can help to identify causal relationships between stress response genes and metabolic pathways from gene expression data. ONDEX also discovered functional annotations for most of the genes that emerged as significant in the microarray experiment but were previously of unknown function.
