Bayesian Knowledge Corroboration with Logical Rules and User Feedback

Current knowledge bases suffer from either low coverage or low accuracy. The underlying hypothesis of this work is that user feedback can greatly improve the quality of automatically extracted knowledge bases. The feedback could help quantify the uncertainty associated with the stored statements and would enable mechanisms for searching, ranking and reasoning at the entity-relationship level. Most importantly, a principled model for exploiting user feedback to learn the truth values of statements in the knowledge base would be a major step forward in addressing the issue of knowledge base curation. We present a family of probabilistic graphical models that builds on user feedback and logical inference rules derived from the popular Semantic-Web formalism of RDFS [1]. Through internal inference and belief propagation, these models can learn both the truth values of the statements in the knowledge base and the reliabilities of the users who give feedback. We demonstrate the viability of our approach in extensive experiments on real-world datasets, with feedback collected from Amazon Mechanical Turk.
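To make the idea of jointly learning statement truth values and user reliabilities concrete, the following is a minimal sketch and not the model presented in the paper. It assumes a much simpler setting: each user has a single reliability (the probability that their vote matches the true value), no logical rules are used, and estimation is done with a plain expectation-maximization loop rather than belief propagation. The function name `corroborate`, the vote format, and the example data are all hypothetical.

```python
import math


def corroborate(votes, n_iter=50, prior_true=0.5):
    """Jointly estimate statement truth and user reliability (illustrative only).

    votes: list of (user, statement, vote) tuples, where vote is True or False.
    Returns (p_true, reliability): dicts mapping statement -> P(true) and
    user -> estimated probability that the user's vote is correct.
    """
    users = {u for u, _, _ in votes}
    statements = {s for _, s, _ in votes}
    # Assumption: start by trusting every user slightly more than chance.
    reliability = {u: 0.7 for u in users}
    p_true = {s: prior_true for s in statements}
    prior_log_odds = math.log(prior_true / (1.0 - prior_true))

    for _ in range(n_iter):
        # E-step: update the belief in each statement from the weighted votes.
        log_odds = {s: prior_log_odds for s in statements}
        for u, s, v in votes:
            r = min(max(reliability[u], 1e-3), 1.0 - 1e-3)
            weight = math.log(r / (1.0 - r))
            log_odds[s] += weight if v else -weight
        p_true = {s: 1.0 / (1.0 + math.exp(-lo)) for s, lo in log_odds.items()}

        # M-step: reliability is the expected fraction of a user's votes that
        # agree with the current beliefs (one pseudo-hit and one pseudo-miss
        # are added as smoothing).
        agree = {u: 1.0 for u in users}
        total = {u: 2.0 for u in users}
        for u, s, v in votes:
            agree[u] += p_true[s] if v else 1.0 - p_true[s]
            total[u] += 1.0
        reliability = {u: agree[u] / total[u] for u in users}

    return p_true, reliability


# Hypothetical feedback: alice and bob mostly agree, carol mostly contradicts them.
example_votes = [
    ("alice", "s1", True), ("bob", "s1", True), ("carol", "s1", False),
    ("alice", "s2", True), ("bob", "s2", True), ("carol", "s2", False),
    ("alice", "s3", False), ("bob", "s3", False), ("carol", "s3", True),
    ("alice", "s4", True), ("bob", "s4", False), ("carol", "s4", True),
]
beliefs, trust = corroborate(example_votes)
```

In this toy setting the majority view on `s1` to `s3` pulls down the estimated reliability of the dissenting user, which in turn reduces the weight of that user's vote on the contested statement `s4`. The models described in the abstract additionally couple statements through RDFS-derived rules, which this sketch omits.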

[1]  Dan Olteanu,et al.  10^(10^6) Worlds and Beyond: Efficient Representation and Processing of Incomplete Information , 2007, ICDE.

[2]  Herman J. ter Horst,et al.  Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary , 2005, J. Web Semant.

[3]  Jens Lehmann,et al.  Discovering Unknown Connections - the DBpedia Relationship Finder , 2007, CSSW.

[4]  Ofer Meshi,et al.  Template Based Inference in Symmetric Relational Markov Random Fields , 2007, UAI.

[5]  Gerhard Weikum,et al.  Active knowledge: dynamically enriching RDF knowledge bases by web services , 2010, SIGMOD Conference.

[6]  C. Koch,et al.  10^(10^6) Worlds and Beyond: Efficient Representation and Processing of Incomplete Information , 2007 .

[7]  D. Y. Chechelnytskyy,et al.  Wolfram Alpha: computational knowledge engine , 2012 .

[8]  Daniel N. Osherson,et al.  Aggregating disparate estimates of chance , 2006, Games Econ. Behav.

[9]  Daniel S. Weld,et al.  Autonomously semantifying wikipedia , 2007, CIKM '07.

[10]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[11]  Marvin Minsky,et al.  A framework for representing knowledge , 1974 .

[12]  Jens Lehmann,et al.  DBpedia: A Nucleus for a Web of Open Data , 2007, ISWC/ASWC.

[13]  Gerhard Weikum,et al.  SOFIE: a self-organizing framework for information extraction , 2009, WWW '09.

[14]  Gerhard Weikum,et al.  YAGO: A Core of Semantic Knowledge , 2007, WWW '07.

[15]  Lise Getoor,et al.  PrDB: managing and exploiting rich correlations in probabilistic databases , 2009, The VLDB Journal.

[16]  Christopher Ré,et al.  Probabilistic databases: diamonds in the dirt , 2009, CACM.

[17]  Paulo Cesar G. da Costa,et al.  A First-Order Bayesian Tool for Probabilistic Ontologies , 2008, FLAIRS Conference.

[18]  Brendan J. Frey,et al.  A Revolution: Belief Propagation in Graphs with Cycles , 1997, NIPS.

[19]  James Fogarty,et al.  Intelligence in Wikipedia , 2008, AAAI.

[20]  David Poole,et al.  First-order probabilistic inference , 2003, IJCAI.

[21]  Gerhard Weikum,et al.  STAR: Steiner-Tree Approximation in Relationship Graphs , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[22]  Tom Minka,et al.  A family of algorithms for approximate Bayesian inference , 2001 .

[23]  Diego Calvanese,et al.  The Description Logic Handbook , 2007 .

[24]  Dan Brickley,et al.  RDF Vocabulary Description Language 1.0: RDF Schema , 2004 .

[25]  Parag Agrawal,et al.  Trio: a system for data, uncertainty, and lineage , 2006, VLDB.

[26]  Lise Getoor,et al.  Learning Probabilistic Relational Models , 1999, IJCAI.

[27]  Gerhard Weikum,et al.  NAGA: Searching and Ranking Knowledge , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[28]  Gerardo Hermosillo,et al.  Learning From Crowds , 2010, J. Mach. Learn. Res.

[29]  Ronald J. Brachman,et al.  An overview of the KL-ONE Knowledge Representation System , 1985 .

[30]  Serge Abiteboul,et al.  Corroborating information from disagreeing views , 2010, WSDM '10.

[31]  Pedro M. Domingos,et al.  Lifted First-Order Belief Propagation , 2008, AAAI.

[32]  Gerhard Weikum,et al.  MING: mining informative entity relationship subgraphs , 2009, CIKM.

[33]  Matthew Richardson,et al.  Markov logic networks , 2006, Machine Learning.

[34]  Lise Getoor.  Tutorial on Statistical Relational Learning , 2005, ILP.

[35]  Audun Jøsang,et al.  Exploring Different Types of Trust Propagation , 2006, iTrust.

[36]  Jaime Teevan,et al.  Implicit feedback for inferring user preference: a bibliography , 2003, SIGF.