Triple Scoring Using a Hybrid Fact Validation Approach - The Catsear Triple Scorer at WSDM Cup 2017

As the volume of data published daily in knowledge bases across the Web continues to grow, information relevance has become one of the main issues. In most knowledge bases, a triple (i.e., a statement composed of a subject, a predicate, and an object) can only be true or false. However, triples can also be assigned a score so that information can be sorted by relevance. In this work, we describe the participation of the Catsear team in the Triple Scoring Challenge at the WSDM Cup 2017. The Catsear approach scores triples by combining the answers from three different sources using a linear regression classifier. Our approach achieved an Accuracy2 value of 79.58% and 4th place overall.
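To make the combination step concrete, the following is a minimal sketch of how per-source answers could be merged by a linear regression and then evaluated. It is not the Catsear implementation: the function names, the toy data, the least-squares fit via NumPy, the integer score range of 0 to 7, and the reading of Accuracy2 as "predicted score within 2 of the reference score" are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_combiner(source_scores, gold_scores):
    """Fit weights (plus a bias term) by ordinary least squares.

    source_scores: array-like of shape (n_triples, n_sources); here three
    hypothetical sources, one column each.
    """
    n = len(source_scores)
    X = np.column_stack([source_scores, np.ones(n)])  # append bias column
    weights, *_ = np.linalg.lstsq(X, np.asarray(gold_scores, dtype=float), rcond=None)
    return weights


def predict_scores(weights, source_scores, lo=0, hi=7):
    """Combine source answers and map them to integer relevance scores.

    The [lo, hi] range is an assumption about the target score scale.
    """
    n = len(source_scores)
    X = np.column_stack([source_scores, np.ones(n)])
    raw = X @ weights
    return np.clip(np.rint(raw), lo, hi).astype(int)


def accuracy_within(predicted, gold, delta=2):
    """Fraction of predictions that differ from gold by at most delta."""
    diff = np.abs(np.asarray(predicted) - np.asarray(gold))
    return float(np.mean(diff <= delta))


# Toy data (hypothetical): each row holds the answers of three sources
# for one triple, each answer normalised to [0, 1].
train_sources = [
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
train_gold = [7, 1, 3, 4, 2]

weights = fit_linear_combiner(train_sources, train_gold)
predicted = predict_scores(weights, train_sources)
print(predicted, accuracy_within(predicted, train_gold))
```

In this sketch the regression output is rounded and clipped so that the result can be compared against integer reference scores; a real system would fit the weights on held-out training triples rather than on the triples being scored.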
