Logical comparison over RDF resources in bio-informatics

Comparing resources is a frequent task in bio-informatics applications, including drug-target interaction, drug repositioning and the understanding of mechanisms of action. This paper proposes a general method for the logical comparison of resources modeled in the Resource Description Framework (RDF) and illustrates its distinguishing features with reference to the comparison of drugs. In particular, the method returns a description of the commonalities between resources, rather than a numerical value estimating their similarity and/or relatedness. The approach is domain-independent and can be flexibly adapted to heterogeneous use cases through a fully explicit process for setting parameters. The paper also presents an experiment that uses the BioPortal dataset as its knowledge source; the experiment is fully reproducible, because the criteria and values used for parameter customization are stated explicitly.
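To make the contrast with numerical similarity scores concrete, the following minimal Python sketch returns the shared part of two resource descriptions instead of a number. It is a deliberately simplified illustration, not the method proposed in the paper: it only intersects the (predicate, object) pairs directly attached to each resource. The triples, the `ex:` identifiers and the function names are hypothetical and chosen for illustration only.

```python
from typing import Iterable, Set, Tuple

# A triple is (subject, predicate, object); all identifiers are hypothetical.
Triple = Tuple[str, str, str]


def description(resource: str, triples: Iterable[Triple]) -> Set[Tuple[str, str]]:
    """Collect the (predicate, object) pairs stated directly about a resource."""
    return {(p, o) for s, p, o in triples if s == resource}


def common_description(a: str, b: str, triples: Iterable[Triple]) -> Set[Tuple[str, str]]:
    """Return the statements that hold for both resources (a naive intersection)."""
    triples = list(triples)  # allow the input to be traversed twice
    return description(a, triples) & description(b, triples)


# Invented toy data, not taken from BioPortal or any real dataset.
TRIPLES = [
    ("ex:drugA", "rdf:type", "ex:Drug"),
    ("ex:drugA", "ex:hasTarget", "ex:protein1"),
    ("ex:drugA", "ex:treats", "ex:disease1"),
    ("ex:drugB", "rdf:type", "ex:Drug"),
    ("ex:drugB", "ex:hasTarget", "ex:protein1"),
    ("ex:drugB", "ex:treats", "ex:disease2"),
]

shared = common_description("ex:drugA", "ex:drugB", TRIPLES)
for predicate, obj in sorted(shared):
    print(predicate, obj)
```

In this toy example the result is a small set of statements (both resources are drugs and share a target) that a reader can inspect directly, which is the kind of output the abstract contrasts with a single similarity value.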
