Holistic Entity Clustering for Linked Data

Pairwise link discovery approaches for the Web of Data do not scale to many sources, which limits their potential for data integration. We therefore propose a holistic approach for linking many data sources, based on clustering entities that represent the same real-world object. Our clustering approach utilizes existing links and can deal with entities of different semantic types. It is able to identify errors in existing links and to find numerous additional links. An initial evaluation on real-world linked data shows the effectiveness of the proposed holistic entity matching.
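
The idea of clustering entities from existing links can be illustrated with a minimal sketch. It is not the method of the paper, whose details are not given here; it simply treats existing links as edges and takes connected components as clusters. Entities are modelled as hypothetical `(source, id)` tuples, and the function names `cluster_entities`, `suspicious_clusters`, and `implied_links` are illustrative. The check for link errors rests on an added assumption that each source is duplicate-free, so a cluster holding two entities from one source hints at a wrong link.

```python
from collections import defaultdict
from itertools import combinations


def cluster_entities(links):
    """Group entities into connected components of the link graph (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for entity in list(parent):
        groups[find(entity)].add(entity)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(group) for group in groups.values())


def suspicious_clusters(clusters):
    """Flag clusters with several entities from one source.

    Assumption (not from the source text): each data source is duplicate-free.
    """
    flagged = []
    for cluster in clusters:
        sources = [source for source, _ in cluster]
        if len(sources) != len(set(sources)):
            flagged.append(cluster)
    return flagged


def implied_links(clusters, links):
    """List entity pairs that share a cluster but have no existing direct link."""
    existing = {frozenset(link) for link in links}
    return [
        (a, b)
        for cluster in clusters
        for a, b in combinations(cluster, 2)
        if frozenset((a, b)) not in existing
    ]


# Toy input with made-up sources and identifiers.
links = [
    (("srcA", "a1"), ("srcB", "b1")),
    (("srcB", "b1"), ("srcC", "c1")),
    (("srcA", "a2"), ("srcB", "b2")),
    (("srcA", "a3"), ("srcB", "b2")),
]
clusters = cluster_entities(links)
```

On this toy input, `implied_links(clusters, links)` yields the pair `a1`/`c1` as a candidate additional link, while `suspicious_clusters(clusters)` flags the cluster containing both `a2` and `a3`. Plain connected components ignore entity similarity and semantic types, so a real system would need to refine or split such clusters.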
