Exploiting structures in keyword queries for effective XML search

Keyword search on XML documents has received considerable research interest recently. Most existing methods emphasize the document side, focusing on how to use the structural properties of XML documents to produce better search results, more effective ranking methods, or more efficient algorithms. However, effective XML search requires a full understanding of not only XML documents but also XML keyword queries, and little attention has been paid to the latter. In this paper, we focus on the query side of XML keyword search rather than the document side. We show that keyword queries have structure, and we define a concept called keyword query with structure (QWS) to capture it. Because query structure provides hints about the intent of the query, it can be used to improve the quality of search results. We exploit some key observations to characterize the structure in a keyword query and show how to refine search results with the assistance of query structure. To take advantage of query structure, we design a query processing approach that produces results for a given keyword query: it first derives a set of QWSs based on heuristics, computes the results of these queries, and then expands the results if needed. We implement the proposed methods and conduct comprehensive experiments. The experimental results verify the effectiveness of our methods.
