Learning Triple Sequence Patterns in Knowledge Graphs to Predict Inconsistencies

The current trend towards the Semantic Web and Linked Data has resulted in an unprecedented volume of data being continuously published on the Linked Open Data (LOD) cloud. Massive Knowledge Graphs (KGs) are increasingly constructed and enriched based on large amounts of unstructured data. However, the data quality of KGs can still suffer from a variety of inconsistencies, misinterpretations and incomplete information. This study investigates the feasibility of utilising the subject-predicate-object (SPO) structure of KG triples to detect possible inconsistencies. The key idea hinges on using the Freebase-defined entity types to extract the unique SPO patterns in the KG. Using machine learning, the problem of predicting inconsistencies can then be approached as a sequence classification task. The applicability of the approach was evaluated on a subset of the Freebase KG comprising about 6M triples. The experiments showed promising results for detecting inconsistent sequences using ConvNet and LSTM models.
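To make the idea of type-based SPO patterns concrete, the following is a minimal sketch, not the study's actual pipeline. It assumes each entity maps to a single type and that a pattern is the tuple (subject type, predicate, object type). The entity identifiers, the `ENTITY_TYPES` map, and the helper names `to_pattern`, `build_vocab` and `encode` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: identifiers and types are illustrative only.
ENTITY_TYPES = {
    "m.obama": "people.person",
    "m.hawaii": "location.location",
    "m.inception": "film.film",
    "m.nolan": "people.person",
}

TRIPLES = [
    ("m.obama", "people.person.place_of_birth", "m.hawaii"),
    ("m.inception", "film.film.directed_by", "m.nolan"),
    # Subject and object swapped: a location "born in" a person.
    ("m.hawaii", "people.person.place_of_birth", "m.obama"),
]


def to_pattern(triple, entity_types):
    """Replace subject and object by their entity types (assumed one type each)."""
    subject, predicate, obj = triple
    return (
        entity_types.get(subject, "UNK"),
        predicate,
        entity_types.get(obj, "UNK"),
    )


def build_vocab(patterns):
    """Assign an integer id to every token; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for pattern in patterns:
        for token in pattern:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(pattern, vocab):
    """Turn a pattern into an integer sequence usable by a sequence classifier."""
    return [vocab[token] for token in pattern]


patterns = [to_pattern(t, ENTITY_TYPES) for t in TRIPLES]
pattern_counts = Counter(patterns)  # frequency of each unique SPO pattern
vocab = build_vocab(patterns)
encoded = [encode(p, vocab) for p in patterns]
```

Under these assumptions, the integer sequences in `encoded` could serve as input to a sequence classifier such as a convolutional or LSTM network, while `pattern_counts` shows how often each unique pattern occurs. How the study labels sequences as consistent or inconsistent is not specified here, so no labelling step is sketched.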