Leopard - A baseline approach to attribute prediction and validation for knowledge graph population

Abstract: In this paper, we report on the participation of Leopard in the Semantic Web Challenge at the 16th International Semantic Web Conference. Leopard predicts and validates attributes for knowledge graph population and was designed as a baseline for the challenge. It combines diverse text extraction methods with a simple precision ranking and draws on sources from both the multilingual Document Web and the multilingual Data Web. Despite being designed as a baseline, Leopard achieved the second-best score in both challenge tasks (53.42% F1-Score and 53.09% AUC), behind IBM’s system Socrates (55.40% F1-Score and 68.01% AUC). Our approach is open source (https://github.com/dice-group/Leopard).
