Expressive power of entity-linking frameworks

Abstract

We develop a unifying approach to declarative entity linking by introducing the notion of an entity-linking framework and an accompanying notion of the certain links in such a framework. In an entity-linking framework, logic-based constraints are used to express properties of the desired link relations in terms of source relations and, possibly, in terms of other link relations. The definition of the certain links in such a framework makes use of weighted repairs and consistent answers in inconsistent databases. We demonstrate the modeling capabilities of this approach by showing that numerous concrete entity-linking scenarios can be cast as such entity-linking frameworks for suitable choices of constraints and weights. By using the certain links as a measure of expressive power, we investigate the relative expressive power of several entity-linking frameworks and obtain sharp comparisons.
