Ontology-based Queries over Cancer Data

The ever-increasing amount of data in biomedical research, and in cancer research in particular, needs to be managed to support efficient data access, exchange and integration. Existing software infrastructures, such as caGrid, support access to distributed information annotated with a domain ontology. However, caGrid's current querying functionality depends on the structure of individual data resources and does not exploit the semantic annotations. In this paper, we present the design and development of an ontology-based querying functionality that consists of two parts: the generation of OWL2 ontologies from the metadata of the underlying data resources, and a query rewriting and translation process based on reasoning, which converts a query at the domain ontology level into queries at the software infrastructure level. We present a detailed analysis of our approach as well as an extensive performance evaluation. While the implementation and evaluation were performed for the caGrid infrastructure, the approach could be applicable to other model-driven and metadata-driven environments for data sharing.
