Entity-based Data Source Contextualization for Searching the Web of Data

To allow search on the Web of data, systems have to combine data from multiple sources. However, to effectively fulfill user information needs, systems must be able to “look beyond” exactly matching data sources and offer information from additional, contextual sources (data source contextualization). For this, users should be involved in the source selection process by choosing which sources contribute to their search results. Previous work, however, targets source contextualization only for “Web tables” and relies on schema information and simple relational entities. Addressing these shortcomings, we build on work from the field of data mining and show how to enable Web data source contextualization. Based on a real-world use case, we built a prototype contextualization engine, which we integrated into a system for searching the Web of data. We empirically validated the effectiveness of our approach, achieving performance gains of up to 29% over the state of the art.
