Evaluation of instance matching tools: The experience of OAEI

The availability of large collections of data calls for techniques and tools that can link data together by retrieving potentially useful relations among them and by associating data that represent the same or similar real-world objects. One of the main problems in developing data linking techniques and tools is assessing the quality of the results produced by the matching process. In this paper, we describe the experience of instance matching and data linking evaluation in the context of the Ontology Alignment Evaluation Initiative (OAEI). Our goal is to validate the different proposed methods, identify the most promising techniques and directions for improvement, and subsequently guide further research in the area as well as the development of robust tools for real-world tasks.
