Towards integrative gene functional similarity measurement

Background: In Gene Ontology, the "Molecular Function" (MF) categorization is a widely used knowledge framework for gene function comparison and prediction. Its structure and annotation provide a convenient way to compare gene functional similarities at the molecular level. Existing gene similarity measures, however, rely on only one or a few aspects of MF and do not use all of the rich information available, including structure, annotation, common terms, and lowest common parents.

Results: We introduce a rank-based gene semantic similarity measure called InteGO, which synergistically integrates state-of-the-art gene-to-gene similarity measures. By integrating three GO-based seed measures, InteGO significantly improves performance, by about two-fold, in all three species studied (yeast, Arabidopsis and human).

Conclusions: InteGO is a systematic and novel method to study gene functional associations. The software and description are available at http://www.msu.edu/~jinchen/InteGO.
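To make the idea of rank-based integration concrete, the sketch below combines several seed measures by converting each measure's gene-pair scores to ranks and averaging those ranks. This is an illustration only: the abstract does not specify how InteGO combines ranks, so mean-rank aggregation is an assumption, and the names `to_ranks` and `integrate_by_mean_rank` are hypothetical rather than part of the InteGO software.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # a gene pair, e.g. ("geneA", "geneB")


def to_ranks(scores: Dict[Pair, float]) -> Dict[Pair, float]:
    """Rank pairs by descending similarity (1 = most similar).

    Tied scores share the average of the rank positions they occupy.
    """
    ordered = sorted(scores, key=lambda p: -scores[p])
    ranks: Dict[Pair, float] = {}
    i = 0
    while i < len(ordered):
        # Extend j to cover the run of pairs tied with ordered[i].
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def integrate_by_mean_rank(measures: List[Dict[Pair, float]]) -> Dict[Pair, float]:
    """Assumed aggregation rule: average each pair's rank across all seed measures.

    Only pairs scored by every measure are kept. A lower mean rank means the
    pair is consistently judged more similar.
    """
    shared = set(measures[0])
    for measure in measures[1:]:
        shared &= set(measure)
    rank_tables = [to_ranks({p: m[p] for p in shared}) for m in measures]
    return {p: sum(t[p] for t in rank_tables) / len(rank_tables) for p in shared}
```

Working on ranks rather than raw scores sidesteps the fact that different seed measures may produce values on different scales, which is one plausible motivation for a rank-based design.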

[1]  Rupert G. Miller Simultaneous Statistical Inference , 1966 .

[2]  Matthew O. Ward,et al.  XmdvTool: integrating multiple methods for visualizing multivariate data , 1994, Proceedings Visualization '94.

[3]  David W. Conrath,et al.  Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy , 1997, ROCLING/IJCLCLP.

[4]  Göran Goldkuhl,et al.  Method intergration: the need for a learning perspective , 1998, IEE Proc. Softw..

[5]  Dekang Lin,et al.  An Information-Theoretic Definition of Similarity , 1998, ICML.

[6]  Philip Resnik,et al.  Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language , 1999, J. Artif. Intell. Res..

[7]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[8]  中尾 光輝,et al.  KEGG(Kyoto Encyclopedia of Genes and Genomes)〔和文〕 (特集 ゲノム医学の現在と未来--基礎と臨床) -- (データベース) , 2000 .

[9]  Gene Ontology Consortium The Gene Ontology (GO) database and informatics resource , 2003 .

[10]  C. Claudel-Renard,et al.  Enzyme-specific profiles for genome annotation: PRIAM. , 2003, Nucleic acids research.

[11]  C. Burge,et al.  Prediction of Mammalian MicroRNA Targets , 2003, Cell.

[12]  Carole A. Goble,et al.  Investigating Semantic Similarity Measures Across the Gene Ontology: The Relationship Between Sequence and Annotation , 2003, Bioinform..

[13]  P. Karp Call for an enzyme genomics initiative , 2004, Genome Biology.

[14]  Douglas P. Wiens,et al.  MATCH - A Software Package for Robust Profile Matching Using S-Plus , 2004 .

[15]  Padhraic Smyth,et al.  Analysis and Visualization of Network Data using JUNG , 2005 .

[16]  Frank Holstege,et al.  Predicting gene function through systematic analysis and quality assessment of high-throughput data , 2005, Bioinform..

[17]  Zhiyong Lu,et al.  GO Molecular Function Terms Are Predictive of Subcellular Localization , 2004, Pacific Symposium on Biocomputing.

[18]  Angel Rubio,et al.  Correlation between Gene Expression and GO Semantic Similarity , 2005, TCBB.

[19]  D. Nutt,et al.  Translocator protein (18kDa): new nomenclature for the peripheral-type benzodiazepine receptor based on its structure and molecular function. , 2006, Trends in pharmacological sciences.

[20]  J. J. Díaz-Mejía,et al.  A network perspective on the evolution of metabolism by gene duplication , 2007, Genome Biology.

[21]  D. Allison,et al.  Microarray data analysis: from disarray to consolidation and consensus , 2006, Nature Reviews Genetics.

[22]  Thomas Lengauer,et al.  A new measure for functional similarity of gene products based on Gene Ontology , 2006, BMC Bioinformatics.

[23]  Haiyuan Yu,et al.  Developing a similarity measure in biological function space , 2007 .

[24]  Zheng Guo,et al.  Globally predicting protein functions based on co-expressed protein-protein interaction networks and ontology taxonomy similarities. , 2007, Gene.

[25]  Mark Gerstein,et al.  Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications , 2007, Bioinform..

[26]  Philip S. Yu,et al.  A new method to measure the semantic similarity of GO terms , 2007, Bioinform..

[27]  Yves A. Lussier,et al.  Evaluation of high-throughput functional categorization of human disease genes , 2007, BMC Bioinformatics.

[28]  K. Dolinski,et al.  Use and misuse of the gene ontology annotations , 2008, Nature Reviews Genetics.

[29]  Catia Pesquita,et al.  Metrics for GO based protein semantic similarity: a systematic evaluation , 2008, BMC Bioinformatics.

[30]  Phillip W. Lord,et al.  Semantic Similarity in Biomedical Ontologies , 2009, PLoS Comput. Biol..

[31]  David Sánchez,et al.  An ontology-based measure to compute semantic similarity in biomedicine , 2011, J. Biomed. Informatics.

[32]  Diyuan Yang,et al.  An integration strategy to measure enzyme activities for detecting irreversible inhibitors with dimethoate on butyrylcholinesterase as a model , 2011 .

[33]  Igor Jurisica,et al.  Novel semantic similarity measure improves an integrative approach to predicting gene functional associations , 2013, BMC Systems Biology.

[34]  Xiaoyan Liu,et al.  Measuring gene functional similarity based on group-wise comparison of GO terms , 2013, Bioinform..

[35]  Yadong Wang,et al.  Identifying cross-category relations in gene ontology and constructing genome-specific term association networks , 2013, BMC Bioinformatics.

[36]  Xiaomei Wu,et al.  Improving the Measurement of Semantic Similarity between Gene Ontology Terms and Gene Products: Insights from an Edge- and IC-Based Hybrid Method , 2013, PloS one.