SWSNL: Semantic Web Search Using Natural Language

As modern search engines approach the ability to handle queries expressed in natural language, full support for natural language interfaces appears to be the next step in the development of future systems. The vision is that users can tell a computer what they would like to find, using any number of sentences and as much detail as they need. In this article we describe our effort to move towards this future using currently available technology. We chose the Semantic Web framework as the best means of achieving this goal. We present our approach to building a complete Semantic Web Search Using Natural Language (SWSNL) system. We cover the entire process, which includes preprocessing, semantic analysis, semantic interpretation, and the execution of a SPARQL query to retrieve the results. We perform an end-to-end evaluation on a domain dealing with accommodation options. The domain data come from an existing accommodation portal, and we use a corpus of queries obtained through a Facebook campaign. We work with written texts in the Czech language. In addition, the Natural Language Understanding (NLU) module is evaluated on another domain (public transportation) and another language (English). We expect our findings to be valuable for the research community because they are closely tied to issues found in real-world scenarios: among other problems, we struggled with inconsistencies in the actual Web data and with the performance of Semantic Web engines on a decently sized knowledge base.
