Database and Expert Systems Applications

The increasing number of RDF data sources that allow for querying Linked Data via Web services forms the basis for federated SPARQL query processing. Federated SPARQL query engines provide a unified view of a federation of RDF data sources, and rely on source descriptions to select the data sources over which unified queries will be executed. Albeit efficient, existing federated SPARQL query engines usually ignore the meaning of the data accessible from a data source, and describe sources only in terms of the vocabularies used in the data source. This lack of semantic source descriptions may lead to the erroneous selection of data sources for a query, thus affecting the performance of query processing over the federation. We tackle the problem of federated SPARQL query processing and devise MULDER, a query engine for federations of RDF data sources. MULDER describes data sources in terms of RDF molecule templates, i.e., abstract descriptions of entities belonging to the same RDF class. Moreover, MULDER utilizes RDF molecule templates for source selection, and for query decomposition and optimization. We empirically study the performance of MULDER on existing benchmarks, and compare it with state-of-the-art federated SPARQL query engines. Experimental results suggest that RDF molecule templates empower MULDER federated query processing, and allow for the selection of RDF data sources that not only reduce execution time, but also increase answer completeness.
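
To make the idea of template-based source selection more concrete, the following is a minimal illustrative sketch, not MULDER's actual implementation. It assumes that an RDF molecule template can be reduced to an RDF class, the set of predicates its entities use, and the endpoints that serve it, and that a query fragment is a set of predicates sharing one subject. The names `MoleculeTemplate` and `select_sources`, the `ex:` terms, and the endpoint URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoleculeTemplate:
    """Hypothetical, simplified stand-in for an RDF molecule template."""
    rdf_class: str          # RDF class the described entities belong to
    predicates: frozenset   # predicates observed on entities of that class
    sources: frozenset      # endpoints that serve such entities


def select_sources(templates, star_predicates):
    """Return endpoints whose template covers every predicate of the fragment.

    Assumption: a source is relevant only if one of its templates contains
    all predicates that the query fragment attaches to a single subject.
    """
    needed = set(star_predicates)
    selected = set()
    for template in templates:
        if needed <= template.predicates:
            selected |= template.sources
    return selected


# Made-up federation with two endpoints.
templates = [
    MoleculeTemplate(
        "ex:Drug",
        frozenset({"rdf:type", "ex:name", "ex:target"}),
        frozenset({"http://example.org/drugs/sparql"}),
    ),
    MoleculeTemplate(
        "ex:Disease",
        frozenset({"rdf:type", "ex:name"}),
        frozenset({"http://example.org/diseases/sparql"}),
    ),
]

# Only the drug endpoint describes entities having both predicates.
relevant = select_sources(templates, ["ex:name", "ex:target"])
```

Under these assumptions, a description based only on vocabulary would consider both endpoints relevant for `ex:name`, whereas matching the whole fragment against class-level templates narrows the selection to the endpoint that can actually answer it.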
