Treo: Combining Entity-Search, Spreading Activation and Semantic Relatedness for Querying Linked Data

This paper describes Treo, a natural language query mechanism for Linked Data that aims to provide precise and scalable semantic matching between natural language queries and distributed, heterogeneous Linked Datasets. Treo’s semantic matching approach combines three key elements: entity search, a Wikipedia-based semantic relatedness measure, and spreading activation search. Entity search allows Treo to cope with queries over high-volume, distributed data, while the combination of entity search and spreading activation search using a Wikipedia-based semantic relatedness measure provides a flexible approach to handling the semantic match between natural language queries and Linked Data. Experimental results on the DBpedia QALD training query set suggest that this combination is a promising line of investigation, achieving a mean reciprocal rank of 0.489, precision of 0.395, and recall of 0.451.
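The interplay of the three elements can be illustrated with a minimal sketch. It is not Treo's implementation: the graph, the entity names, the `spread` function, the threshold value, and the token-overlap `toy_relatedness` function are all hypothetical stand-ins. In particular, `toy_relatedness` only matches identical words, whereas a Wikipedia-based measure would also score distinct but related terms. The sketch assumes that entity search has already resolved the query to a starting (pivot) node, and that the remaining query terms are matched one by one against edge labels.

```python
from typing import Callable, Dict, List, Tuple

# Toy graph (hypothetical data): node -> list of (property label, neighbour node).
Graph = Dict[str, List[Tuple[str, str]]]


def toy_relatedness(a: str, b: str) -> float:
    """Stand-in for a semantic relatedness measure: Jaccard overlap of tokens.

    Assumption: a real system would use a Wikipedia-based measure here.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def spread(
    graph: Graph,
    pivot: str,
    query_terms: List[str],
    relatedness: Callable[[str, str], float],
    threshold: float = 0.3,  # hypothetical cut-off, not taken from the paper
) -> List[List[str]]:
    """Spread activation from the pivot node, one query term per hop.

    An edge is followed only if its label is related to the current query
    term at or above the threshold. Returns every surviving path as a list
    alternating node, label, node, ...
    """
    frontier: List[List[str]] = [[pivot]]
    for term in query_terms:
        next_frontier: List[List[str]] = []
        for path in frontier:
            for label, neighbour in graph.get(path[-1], []):
                if relatedness(term, label) >= threshold:
                    next_frontier.append(path + [label, neighbour])
        frontier = next_frontier
    return frontier


# Hypothetical example data.
graph: Graph = {
    "Bill_Clinton": [
        ("child", "Chelsea_Clinton"),
        ("birth place", "Hope_Arkansas"),
    ],
    "Chelsea_Clinton": [
        ("spouse", "Marc_Mezvinsky"),
        ("birth place", "Little_Rock"),
    ],
}

paths = spread(graph, "Bill_Clinton", ["child", "spouse"], toy_relatedness)
for path in paths:
    print(" -> ".join(path))
```

Passing the relatedness measure as a parameter keeps the traversal independent of how relatedness is computed, so the toy function could be swapped for a stronger measure without changing `spread`.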
