Integrating Ontological Knowledge and Textual Evidence in Estimating Gene and Gene Product Similarity

With the rising influence of the Gene Ontology, new approaches have emerged in which the similarity between genes or gene products is obtained by comparing the Gene Ontology code annotations associated with them. So far, these approaches have relied solely on the knowledge encoded in the Gene Ontology and the gene annotations associated with the Gene Ontology database. The goal of this paper is to demonstrate that these approaches can be improved by integrating textual evidence extracted from relevant biomedical literature.
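As a rough illustration of the general idea, and not the method used in the paper, the sketch below combines an ontology-based score with a text-based score. Everything in it is an assumption made for the example: the toy hierarchy and annotation counts are hypothetical and are not real Gene Ontology data, the ontology score uses Lin's information-theoretic measure, the text score is a bag-of-words cosine, and the linear weighting with `alpha` is just one simple way the two kinds of evidence might be merged.

```python
import math
from collections import Counter

# Hypothetical is-a hierarchy (child -> parent); not real Gene Ontology terms.
PARENT = {
    "binding": "molecular_function",
    "catalytic_activity": "molecular_function",
    "dna_binding": "binding",
    "rna_binding": "binding",
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {
    "binding": 10,
    "catalytic_activity": 40,
    "dna_binding": 30,
    "rna_binding": 20,
}


def ancestors(term):
    """Return the term followed by all of its ancestors up to the root."""
    path = [term]
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path


def information_content(counts):
    """IC(t) = log(1 / p(t)), where p(t) counts annotations to t or any descendant."""
    cumulative = Counter()
    for term, n in counts.items():
        for a in ancestors(term):
            cumulative[a] += n
    total = sum(counts.values())  # assumes a single root that subsumes every term
    return {t: math.log(total / c) for t, c in cumulative.items()}


def lin_similarity(t1, t2, ic):
    """Lin's measure: 2 * IC(most informative common ancestor) / (IC(t1) + IC(t2))."""
    denom = ic[t1] + ic[t2]
    if denom == 0:
        return 1.0 if t1 == t2 else 0.0
    common = set(ancestors(t1)) & set(ancestors(t2))
    return 2 * max(ic[t] for t in common) / denom


def cosine_similarity(text1, text2):
    """Cosine between raw term-frequency vectors of two whitespace-tokenized texts."""
    v1, v2 = Counter(text1.lower().split()), Counter(text2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1.keys() & v2.keys())
    norm = math.sqrt(sum(c * c for c in v1.values())) * math.sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0


def combined_similarity(terms1, text1, terms2, text2, ic, alpha=0.5):
    """Weighted mix of the best term-pair Lin score and the text cosine (assumed scheme)."""
    onto = max(lin_similarity(a, b, ic) for a in terms1 for b in terms2)
    return alpha * onto + (1 - alpha) * cosine_similarity(text1, text2)


IC = information_content(DIRECT_COUNTS)
score = combined_similarity(
    ["dna_binding"], "binds damaged dna during repair",
    ["rna_binding"], "binds messenger rna during splicing",
    IC,
)
```

Both component scores lie in [0, 1], so `alpha` can be read directly as the relative trust placed in the ontology versus the literature; in practice such a weight would have to be tuned against an external reference such as sequence similarity.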

[1]  Olivier Bodenreider,et al.  Ontology-driven similarity approaches to supporting gene func- tional assessment , 2005 .

[2]  Alexander Budanitsky,et al.  Lexical Semantic Relatedness and Its Application in Natural Language Processing , 1999 .

[3]  Miguel A. Andrade-Navarro,et al.  Automatic Annotation for Biological Sequences by Etraction of Keywords from MEDLINE Abstracts: Development of a Prototype System , 1997, ISMB.

[4]  Miguel A. Andrade-Navarro Position-Specific Annotation of Protein Function Based on Multiple Homologs , 1999, ISMB.

[5]  Philip Resnik,et al.  Using Information Content to Evaluate Semantic Similarity in a Taxonomy , 1995, IJCAI.

[6]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[7]  J. Pustejovsky,et al.  Medstract : Creating Large-scale Information Servers for biomedical libraries , 2002 .

[8]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[9]  Dekang Lin,et al.  An Information-Theoretic Definition of Similarity , 1998, ICML.

[10]  Malik Beshir Malik,et al.  Applied Linear Regression , 2005, Technometrics.

[11]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[12]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[13]  Michael Schroeder,et al.  GoPubMed: exploring PubMed with the Gene Ontology , 2005, Nucleic Acids Res..

[14]  Chris Buckley,et al.  Implementation of the SMART Information Retrieval System , 1985 .

[15]  Olivier Bodenreider,et al.  Non-Lexical Approaches to Identifying Associative Relations in the Gene Ontology , 2004, Pacific Symposium on Biocomputing.

[16]  M S Waterman,et al.  Identification of common molecular subsequences. , 1981, Journal of molecular biology.

[17]  Michael Schroeder,et al.  GoPubMed: ontology-based literature search applied to Gene Ontology and PubMed , 2004, German Conference on Bioinformatics.

[18]  David W. Conrath,et al.  Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy , 1997, ROCLING/IJCLCLP.

[19]  Ted Pedersen,et al.  Maximizing Semantic Relatedness to Perform Word Sense Disambiguation , 2005 .

[20]  Russ B. Altman,et al.  Including Biological Literature Improves Homology Search , 2001, Pacific Symposium on Biocomputing.

[21]  Wei Luo,et al.  Medstract: creating large-scale information servers from biomedical texts , 2002, ACL Workshop on Natural Language Processing in the Biomedical Domain.

[22]  Michael J. E. Sternberg,et al.  SAWTED: Structure Assignment With Text Description-Enhanced detection of remote homologues with automated SWISS-PROT annotation comparisons , 2000, Bioinform..

[23]  Christian Posse,et al.  Cross-Ontological Analytics: Combining Associative and Hierarchical Relations in the Gene Ontologies to Assess Gene Product Similarity , 2006, International Conference on Computational Science.

[24]  Carole A. Goble,et al.  Investigating Semantic Similarity Measures Across the Gene Ontology: The Relationship Between Sequence and Annotation , 2003, Bioinform..

[25]  Pedro M. Coutinho,et al.  Implementation of a Functional Semantic Similarity Measure between Gene-Products , 2003 .

[26]  Carole A. Goble,et al.  Semantic Similarity Measures as Tools for Exploring the Gene Ontology , 2002, Pacific Symposium on Biocomputing.