Knowledge Graph Validation

Knowledge graphs (KGs) have proven to be an important asset for large companies such as Google and Microsoft. KGs play an important role in providing structured, semantically rich information, making it available to both people and machines, and supplying accurate and reliable knowledge. A critical task toward that goal is knowledge validation, which measures whether statements in a KG are semantically correct and correspond to the so-called "real" world. In this paper, we review and evaluate the state-of-the-art approaches, methods, and tools for knowledge validation in KGs. As a result, we demonstrate a lack of reproducibility in the tools' results, offer insights, and state our future research direction.
