Identifying Candidate Datasets for Data Interlinking

One of the design principles that can stimulate the growth and increase the usefulness of the Web of Data is the linking of URIs. However, related URIs typically reside in different datasets managed by different publishers. Hence, the designer of a new dataset must be aware of the existing datasets and inspect their content to define sameAs links. This paper proposes a technique based on probabilistic classifiers that, given a dataset S to be published and a set T of known published datasets, ranks each Ti ∈ T by the probability that links between S and Ti can be found, so that the designer need inspect only the most relevant datasets. Results from our technique show that the search space can be reduced by up to 85%, thereby greatly decreasing the computational effort.
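The ranking step can be illustrated with a small sketch. The abstract says only that the technique relies on probabilistic classifiers, so everything concrete below is an assumption made for illustration: the choice of a naive Bayes score with Laplace smoothing, the use of vocabulary names as features, the function name rank_candidates, and the toy data. It is not the authors' implementation.

```python
import math
from collections import Counter


def rank_candidates(source_features, linkers_by_target, alpha=1.0):
    """Rank target datasets by a naive Bayes score (hypothetical helper).

    source_features: set of feature labels describing the new dataset S.
    linkers_by_target: maps each candidate Ti to a list of feature sets,
        one per dataset already known to link to Ti.
    alpha: Laplace smoothing constant.
    """
    # Collect every feature label seen anywhere, for smoothing.
    vocabulary = set(source_features)
    for linkers in linkers_by_target.values():
        for features in linkers:
            vocabulary.update(features)

    total_linkers = sum(len(v) for v in linkers_by_target.values())
    n_targets = len(linkers_by_target)

    scores = {}
    for target, linkers in linkers_by_target.items():
        # Smoothed prior: share of known links that point at this target.
        prior = (len(linkers) + alpha) / (total_linkers + alpha * n_targets)
        counts = Counter(f for features in linkers for f in features)
        total = sum(counts.values())
        log_score = math.log(prior)
        for feature in source_features:
            # Smoothed likelihood of the feature among datasets linking to Ti.
            likelihood = (counts[feature] + alpha) / (total + alpha * len(vocabulary))
            log_score += math.log(likelihood)
        scores[target] = log_score

    # Highest score first: the most promising datasets to inspect.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented): vocabularies used by datasets that link to each target.
known_links = {
    "geo": [{"geonames", "wgs84"}, {"wgs84", "foaf"}],
    "bib": [{"dcterms", "bibo"}, {"dcterms", "foaf"}],
    "bio": [{"obo"}],
}
ranking = rank_candidates({"dcterms", "bibo"}, known_links)
print(ranking)
```

Under these assumptions, a designer would inspect only the first few entries of the returned list rather than every dataset in T, which is where a reduction of the search space would come from.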
