Cross-Ontological Analytics: Combining Associative and Hierarchical Relations in the Gene Ontologies to Assess Gene Product Similarity

Similarity between genes and gene products is a fundamental diagnostic measure in analyzing biological data and constructing predictive models for functional genomics. With the rising influence of the gene ontologies, two complementary approaches have emerged in which the similarity between two genes or gene products is obtained by comparing the Gene Ontology (GO) annotations associated with them. One approach captures GO-based similarity in terms of hierarchical relations within each gene ontology. The other identifies GO-based similarity in terms of associative relations across the three gene ontologies. We propose a novel methodology that merges the two approaches, with ensuing benefits in coverage and accuracy.
