An integrated information-based similarity measurement of gene ontology terms

Measuring the semantic similarity between pairs of terms in Gene Ontology (GO) can help to compare genes that cannot be compared by other computational methods. In this study, we propose an integrated information-based similarity measurement (IISM) that calculates the semantic similarity between two GO terms by taking into account the multiple common ancestors they share and aggregating the semantic information and depth information of the non-redundant common ancestors. Our method searches for non-redundant common ancestors in an effective way. Validation experiments were conducted on both a gene expression dataset and a pathway dataset, and the experimental results suggest the superiority of our method over some existing methods.
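As a rough illustration of the ingredients named above, the following minimal sketch collects the common ancestors of two terms in a toy directed acyclic graph, filters them, and reports their depths. It is not the IISM procedure itself: the graph and term names are hypothetical, the redundancy rule (a common ancestor is dropped when it is a proper ancestor of another common ancestor) is an assumption, and depth is taken here as the longest path from the root. The abstract does not state the aggregation formula, so none is reproduced.

```python
# Hypothetical toy DAG (not real GO terms): child -> set of parents.
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a"},
    "d": {"a", "b"},
    "t1": {"c", "d"},
    "t2": {"c", "d"},
}


def ancestors(term):
    """Return all ancestors of a term, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(t1, t2):
    """Terms that are ancestors of both t1 and t2."""
    return ancestors(t1) & ancestors(t2)


def non_redundant(common):
    """Assumed rule: keep a common ancestor only if it is not a proper
    ancestor of another common ancestor (i.e. keep the most specific ones)."""
    return {
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    }


def depth(term):
    """Assumed depth definition: longest path length from the root."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


shared = non_redundant(common_ancestors("t1", "t2"))
depths = {term: depth(term) for term in shared}
```

In this toy graph the two terms share five common ancestors, but only `c` and `d` survive the filter, since `a`, `b`, and `root` each lie above one of them. A measure in the spirit of the abstract would then combine an information score and the depth of each surviving ancestor.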
