Detection of Relation Assertion Errors in Knowledge Graphs

Although link prediction, the task of predicting missing relation assertions, has been widely researched, error detection has not received as much attention. In this paper, we investigate the problem of detecting errors in the relation assertions of knowledge graphs. We propose an error detection method that relies on path and type features, which are used by a classifier trained for each relation in the graph and exploiting local feature selection. We perform an extensive evaluation on a variety of datasets, backed by a manual evaluation on DBpedia and NELL. We also propose and evaluate heuristics for selecting the relevant graph paths to be used as features in our method.
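To make the notion of path and type features concrete, the following is a minimal sketch of how such features might be extracted for a single relation assertion. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the toy triples, the `^-1` marker for inverse edges, the feature naming scheme, and the maximum path length of 2 are all hypothetical choices, and the per-relation classifier and local feature selection steps are not shown.

```python
from collections import defaultdict

INV = "^-1"  # hypothetical suffix marking an edge followed against its direction


def build_index(triples):
    """Index every triple in both directions so paths can traverse inverse edges."""
    adj = defaultdict(set)
    for s, p, o in triples:
        adj[s].add((p, o))
        adj[o].add((p + INV, s))
    return adj


def path_features(adj, subj, rel, obj, max_len=2):
    """Collect relation paths of length <= max_len that connect subj to obj.

    The assertion under test (subj, rel, obj) is never traversed, so it cannot
    serve as evidence for itself.
    """
    under_test = {(subj, rel, obj), (obj, rel + INV, subj)}
    found = set()
    frontier = [(subj, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, path in frontier:
            for r, nxt in adj.get(node, ()):
                if (node, r, nxt) in under_test:
                    continue
                new_path = path + (r,)
                if nxt == obj:
                    found.add(new_path)
                next_frontier.append((nxt, new_path))
        frontier = next_frontier
    return found


def assertion_features(adj, types, subj, rel, obj, max_len=2):
    """Combine path features with the types of subject and object (sorted list)."""
    feats = {"path=" + "/".join(p) for p in path_features(adj, subj, rel, obj, max_len)}
    feats |= {"subj_type=" + t for t in types.get(subj, ())}
    feats |= {"obj_type=" + t for t in types.get(obj, ())}
    return sorted(feats)


# Toy graph (hypothetical data, for illustration only).
toy_triples = [
    ("Berlin", "capitalOf", "Germany"),
    ("Berlin", "locatedIn", "Germany"),
    ("Merkel", "bornIn", "Hamburg"),
    ("Hamburg", "locatedIn", "Germany"),
    ("Merkel", "nationality", "Germany"),
]
toy_types = {
    "Merkel": {"Person"},
    "Germany": {"Country"},
    "Hamburg": {"City"},
    "Berlin": {"City"},
}

adj = build_index(toy_triples)
print(assertion_features(adj, toy_types, "Merkel", "nationality", "Germany"))
```

In a setup of this kind, the feature lists of all assertions of one relation would be turned into binary vectors and passed to that relation's classifier, with feature selection performed separately for each relation.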
