Sorry, I don't speak SPARQL: translating SPARQL queries into natural language

In recent years, Semantic Web and Linked Data technologies have reached the backend of a considerable number of applications. Consequently, large amounts of RDF data are constantly being made available worldwide. While experts can easily gather information from this wealth of data by using the W3C standard query language SPARQL, most lay users lack the expertise necessary to interact proficiently with these applications. As a result, non-expert users usually have to rely on forms, query builders, question answering or keyword search tools to access RDF data. However, these tools have so far been unable to explain the queries they generate to lay users, making it difficult for these users to i) assess the correctness of the query generated from their input, ii) adapt their queries, or iii) choose in an informed manner between possible interpretations of their input. This paper addresses this drawback by presenting SPARQL2NL, a generic approach for verbalizing SPARQL queries, i.e., converting them into natural language. Our framework can be integrated into applications where lay users are required to understand SPARQL or to generate SPARQL queries in a direct (forms, query builders) or an indirect (keyword search, question answering) manner. We evaluate our approach on the DBpedia question set provided by QALD-2 within a survey setting with both SPARQL experts and lay users. The results of the 115 completed surveys show that SPARQL2NL can generate complete and easily understandable natural language descriptions. In addition, our results suggest that even SPARQL experts can process the natural language representation of SPARQL queries computed by our approach more efficiently than the corresponding SPARQL queries. Moreover, the approach enables non-experts to reliably understand the content of SPARQL queries.
