Automatic Identity Recognition in the Semantic Web

The OKKAM initiative has recently highlighted the need to move from the traditional web towards a “web of entities”, where descriptions of real-world objects can be retrieved, uniquely identified, and shared over the web. In this paper, we present our view of the entity recognition problem and, in particular, propose methods and techniques to capture the “identity” of a real-world entity in the Semantic Web. We claim that automatic techniques are needed to compare different RDF descriptions of a domain, with the goal of automatically detecting heterogeneous descriptions of the same real-world object. We discuss the main problems and the techniques to solve them, together with experimental results from a real case study on web data.
