Interactive reasoning in uncertain RDF knowledge bases

Recent advances in Web-based information extraction have allowed for the automatic construction of large, semantic knowledge bases, which are typically captured in RDF format. The very nature of the applied extraction techniques however entails that the resulting RDF knowledge bases may face a significant amount of incorrect, incomplete, or even inconsistent (i.e., uncertain) factual knowledge, which makes query answering over this kind of data a challenge. Our reasoner, coined URDF, supports SPARQL queries along with rule-based, first-order predicate logic to infer new facts and to resolve data uncertainty over millions of RDF triplets directly at query time. We demonstrate a fully interactive reasoning engine, combining a Java-based reasoning backend and a Flash-based visualization frontend in a dynamic client-server architecture. Our visualization frontend provides interactive access to the reasoning backend, including tasks like exploring the knowledge base, rule-based and statistical reasoning, faceted browsing of large query graphs, and explaining answers through lineage.

[1]  Jennifer Widom,et al.  Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[2]  Lise Getoor,et al.  PrDB: managing and exploiting rich correlations in probabilistic databases , 2009, The VLDB Journal.

[3]  Daniel J. Abadi,et al.  Scalable Semantic Web Data Management Using Vertical Partitioning , 2007, VLDB.

[4]  Serge Abiteboul,et al.  On the representation and querying of sets of possible worlds , 1987, SIGMOD '87.

[5]  Gerhard Weikum,et al.  RDF-3X: a RISC-style engine for RDF , 2008, Proc. VLDB Endow..

[6]  Matthew Richardson,et al.  Markov logic networks , 2006, Machine Learning.

[7]  Prithviraj Sen,et al.  Representing and Querying Correlated Tuples in Probabilistic Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[8]  Andrew McCallum,et al.  Scalable probabilistic databases with factor graphs and MCMC , 2010, Proc. VLDB Endow..

[9]  Pedro M. Domingos,et al.  A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC , 2008, AAAI.

[10]  Estevam R. Hruschka,et al.  Coupled semi-supervised learning for information extraction , 2010, WSDM '10.

[11]  Gerhard Weikum,et al.  YAGO: A Large Ontology from Wikipedia and WordNet , 2008, J. Web Semant..

[12]  Jennifer Widom,et al.  ULDBs: databases with uncertainty and lineage , 2006, VLDB.

[13]  Daisy Zhe Wang,et al.  BayesStore: managing large, uncertain data repositories with probabilistic graphical models , 2008, Proc. VLDB Endow..

[14]  Fabian M. Suchanek,et al.  URDF: Efficient Reasoning in Uncertain RDF Knowledge Bases with Soft and Hard Rules , 2010 .

[15]  Dan Olteanu,et al.  MayBMS: Managing Incomplete Information with Probabilistic World-Set Decompositions , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[16]  Christopher Ré,et al.  MYSTIQ: a system for finding more answers by using probabilities , 2005, SIGMOD '05.

[17]  Raghu Ramakrishnan,et al.  Optimizing mpf queries: decision support and probabilistic inference , 2007, SIGMOD '07.

[18]  Susanne E. Hambrusch,et al.  Database Support for Probabilistic Attributes and Tuples , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[19]  Doug Downey,et al.  Web-scale information extraction in knowitall: (preliminary results) , 2004, WWW '04.

[20]  Christopher Ré,et al.  Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS , 2011, Proc. VLDB Endow..

[21]  Gerhard Weikum,et al.  YAGO2: exploring and querying world knowledge in time, space, context, and many languages , 2011, WWW.

[22]  Amol Deshpande,et al.  Lineage processing over correlated probabilistic databases , 2010, SIGMOD Conference.