Discovering Relations Among GO-Annotated Clusters by Graph Kernel Methods

The biological interpretation of large-scale gene expression data is one of the challenges in current bioinformatics. The state-of-the-art approach is to cluster the data and then characterize each cluster functionally through enrichment in Gene Ontology (GO) terms [1]. To better assist the interpretation of results, it may be useful to establish connections among different clusters. This machine learning step is sometimes termed cluster meta-analysis. Several approaches have already been proposed, and they usually rely on enrichments computed over flat lists of GO terms. However, GO terms are organized in taxonomical graphs, whose structure should be taken into account when performing enrichment studies. To tackle this problem, we propose a kernel approach that can exploit this graph structure. Finally, we compare our approach against a specific flat-list method by analyzing the cdc15 subset of Spellman's well-known Yeast Cell Cycle dataset [2].
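To make the contrast between flat lists and taxonomical structure concrete, the following is a minimal sketch and not the kernel proposed in this work. It assumes a toy taxonomy with invented term names (not real GO identifiers) and compares two similarity measures between clusters: a flat-list overlap, and a set-intersection kernel computed over ancestor closures in the term graph. The function names `ancestor_closure`, `flat_kernel`, and `closure_kernel` are illustrative only.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy taxonomy, child -> parents. These are NOT real GO terms.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "cell_cycle": {"root"},
    "metabolism": {"root"},
    "mitosis": {"cell_cycle"},
    "dna_replication": {"cell_cycle"},
    "glycolysis": {"metabolism"},
}


def ancestor_closure(terms: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the given terms together with all of their ancestors."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents[term])
    return closed


def flat_kernel(a: Iterable[str], b: Iterable[str]) -> int:
    """Flat-list similarity: number of annotation terms shared verbatim."""
    return len(set(a) & set(b))


def closure_kernel(a: Iterable[str], b: Iterable[str],
                   parents: Dict[str, Set[str]]) -> int:
    """Structure-aware similarity: shared terms after adding ancestors.

    This equals the inner product of the indicator vectors of the two
    closures, so it is a positive semidefinite kernel.
    """
    return len(ancestor_closure(a, parents) & ancestor_closure(b, parents))


# Three hypothetical clusters, each annotated with a single enriched term.
cluster_a = {"mitosis"}
cluster_b = {"dna_replication"}
cluster_c = {"glycolysis"}

flat_ab = flat_kernel(cluster_a, cluster_b)
flat_ac = flat_kernel(cluster_a, cluster_c)
graph_ab = closure_kernel(cluster_a, cluster_b, PARENTS)
graph_ac = closure_kernel(cluster_a, cluster_c, PARENTS)

print("flat:", flat_ab, flat_ac)
print("graph-aware:", graph_ab, graph_ac)
```

In this toy setting the flat-list measure cannot distinguish the two pairs of clusters, while the closure-based measure ranks the two clusters under the same parent term as more similar. Richer graph kernels refine this idea, but the sketch makes no claim about which one is used here.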

[1] Michael Ruogu Zhang, et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization, 1998, Molecular Biology of the Cell.

[2] Naren Ramakrishnan, et al. Reconstructing formal temporal models of cellular events using the GO process ontology, 2005.

[3] Jason C. Mills, et al. GOurmet: A tool for quantitative comparison and visualization of gene expression profiles based on gene ontology (GO) distributions, 2006, BMC Bioinformatics.

[4] Mehryar Mohri, et al. Positive Definite Rational Kernels, 2003, COLT.

[5] Carole A. Goble, et al. Investigating Semantic Similarity Measures Across the Gene Ontology: The Relationship Between Sequence and Annotation, 2003, Bioinform.

[6] Gene Ontology Consortium, et al. The Gene Ontology (GO) project in 2006, 2005, Nucleic Acids Res.

[7] Hisashi Kashima, et al. Marginalized Kernels Between Labeled Graphs, 2003, ICML.

[8] Peter N. Robinson, et al. Ontologizing gene-expression microarray data: characterizing clusters with Gene Ontology, 2004, Bioinform.

[9] Hans-Peter Kriegel, et al. Protein function prediction via graph kernels, 2005, ISMB.

[10] Ziv Bar-Joseph, et al. Analyzing time series gene expression data, 2004, Bioinform.

[11] Alexander J. Smola, et al. Learning with kernels, 1998.

[12] R. Quigg, et al. An integrated strategy for the optimization of microarray data interpretation, 2005, Gene Expression.

[13] Bernhard Schölkopf, et al. Learning Theory and Kernel Machines, 2003, Lecture Notes in Computer Science.

[14] Naren Ramakrishnan, et al. Remembrance of Experiments Past: Analyzing Time Course Datasets to Discover Complex Temporal Invariants, 2005.

[15] Risi Kondor, et al. Diffusion kernels on graphs and other discrete structures, 2002, ICML 2002.

[16] John D. Lafferty, et al. Diffusion Kernels on Graphs and Other Discrete Input Spaces, 2002, ICML.

[17] Rasiah Loganantharaj, et al. Metric for Measuring the Effectiveness of Clustering of DNA Microarray Expression, 2006, BMC Bioinformatics.

[18] Bhubaneswar Mishra, et al. Remembrance of the experiments past: A redescription based tool for discovery in complex systems, 2006.

[19] T. Speed, et al. GOstat: find statistically overrepresented Gene Ontologies within a group of genes, 2004, Bioinformatics.

[20] Joaquín Dopazo, et al. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes, 2004, Bioinform.

[21] William Stafford Noble, et al. Kernel methods for predicting protein-protein interactions, 2005, ISMB.

[22] Cliff Joslyn, et al. The Gene Ontology Categorizer, 2004, ISMB/ECCB.

[23] Ziv Bar-Joseph, et al. STEM: a tool for the analysis of short time series gene expression data, 2006, BMC Bioinformatics.

[24] Thomas Gärtner, et al. On Graph Kernels: Hardness Results and Efficient Alternatives, 2003, COLT.

[25] Bernhard Schölkopf, et al. Kernel Methods in Computational Biology, 2005.

[26] Purvesh Khatri, et al. Ontological analysis of gene expression data: current tools, limitations, and open problems, 2005, Bioinform.

[27] Purvesh Khatri, et al. A comparison of existing tools for ontological analysis of gene expression data, 2005.