An Evolutionary Algorithm to Learn SPARQL Queries for Source-Target-Pairs - Finding Patterns for Human Associations in DBpedia

Efficient usage of the knowledge provided by the Linked Data community is often hindered by the need for domain experts to formulate the right SPARQL queries to answer questions. For new questions they have to decide which datasets are suitable and in which terminology and modelling style to phrase the SPARQL query. In this work we present an evolutionary algorithm to help with this challenging task. Given a training list of source-target node-pair examples, our algorithm can learn patterns, in the form of SPARQL queries, from a SPARQL endpoint. The learned patterns can be visualised to form the basis for further investigation, or they can be used to predict target nodes for new source nodes. Amongst others, we apply our algorithm to a dataset of several hundred human associations (such as "circle - square") to find patterns for them in DBpedia. We show the scalability of the algorithm by running it against a SPARQL endpoint loaded with more than 7.9 billion triples. Further, we use the resulting SPARQL queries to mimic human associations with a Mean Average Precision (MAP) of 39.9 % and a Recall@10 of 63.9 %.
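To make the two reported metrics concrete, the following is a minimal sketch of how MAP and Recall@10 are commonly computed for ranked target predictions. It is not the evaluation code of this work: the function names, the toy node labels, and the assumption that each ranked list contains no duplicates are all illustrative choices.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked prediction list (assumed duplicate-free)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, node in enumerate(ranked, start=1):
        if node in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    return precision_sum / len(relevant)


def recall_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant targets found among the top k predictions."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def evaluate(
    predictions: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
    k: int = 10,
) -> Dict[str, float]:
    """Average both metrics over all source nodes in the ground truth."""
    sources = list(ground_truth)
    if not sources:
        return {"MAP": 0.0, "Recall@k": 0.0}
    ap = [average_precision(predictions.get(s, []), ground_truth[s]) for s in sources]
    rec = [recall_at_k(predictions.get(s, []), ground_truth[s], k) for s in sources]
    return {"MAP": sum(ap) / len(ap), "Recall@k": sum(rec) / len(rec)}


# Hypothetical toy data: source node -> ranked predicted targets.
toy_predictions = {
    "circle": ["square", "round", "ring"],
    "dog": ["bone", "cat"],
}
# Hypothetical ground truth: source node -> associated target(s).
toy_truth = {"circle": {"square"}, "dog": {"cat"}}

if __name__ == "__main__":
    print(evaluate(toy_predictions, toy_truth))
```

In this sketch a source node whose true target never appears in the prediction list contributes 0 to both averages, which is one common convention; other evaluations may handle such cases differently.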
