XML Retrieval with a Natural Language Interface

Effective information retrieval from XML documents requires the user to know both the document structure and a formal query language. XML query languages such as XPath and XQuery are too complex for end users. We present an approach to XML query processing that lets users specify both textual and structural constraints in natural language. We implemented a system that evaluates both formal XPath-like queries and natural language XML queries. We report comparative tests performed with the INEX 2004 topics and XML collection. The results, which are favourable, quantify the performance trade-off between natural language XML queries and formal queries.
