An effective and versatile keyword search engine on heterogeneous data sources

We present EASE, an effective and versatile keyword search engine that lets users access heterogeneous data, comprising unstructured, semi-structured and structured data, without having to learn XPath/XQuery or SQL. EASE addresses a challenge in keyword search that has been neglected in the literature: how to process keyword queries over heterogeneous data efficiently and adaptively. To provide this capability, EASE models unstructured, semi-structured and structured data as graphs, summarizes the graphs, and constructs graph indices instead of using traditional inverted indices for effective keyword search. EASE adopts an extended inverted index to facilitate keyword-based search and employs a novel ranking mechanism to enhance search effectiveness.
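The general idea of treating documents, XML elements and relational tuples as nodes of one graph and answering keyword queries over it can be illustrated with a minimal sketch. This is not the EASE algorithm: the abstract does not specify the graph summarization, index layout or ranking function, so the `DataGraph` class, the hop-based `radius` parameter, the sum-of-distances score and all node identifiers below are assumptions made purely for illustration.

```python
from collections import defaultdict, deque


class DataGraph:
    """Toy graph over heterogeneous data (hypothetical, not EASE's structure)."""

    def __init__(self):
        self.text = {}                   # node id -> text content
        self.adj = defaultdict(set)      # undirected adjacency lists
        self.index = defaultdict(set)    # keyword -> ids of nodes containing it

    def add_node(self, node_id, text):
        self.text[node_id] = text
        for token in text.lower().split():
            self.index[token].add(node_id)

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def within(self, start, radius):
        """Breadth-first search: {node: hops} for nodes at most `radius` hops away."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == radius:
                continue
            for nxt in self.adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return dist

    def search(self, query, radius=2):
        """Return (score, root) pairs; a root qualifies if every keyword occurs
        within `radius` hops. Score (assumed) = sum of nearest-hit distances."""
        keywords = query.lower().split()
        results = []
        for root in self.text:
            dist = self.within(root, radius)
            score = 0
            for kw in keywords:
                hits = [dist[n] for n in self.index.get(kw, ()) if n in dist]
                if not hits:
                    break
                score += min(hits)
            else:
                results.append((score, root))
        return sorted(results)


# One graph holding all three kinds of data (identifiers are made up).
g = DataGraph()
g.add_node("doc:1", "keyword search over graphs")    # unstructured text
g.add_node("xml:paper", "paper")                     # XML element
g.add_node("xml:title", "effective keyword search")  # XML child element
g.add_node("row:author:7", "alice")                  # relational tuple
g.add_edge("xml:paper", "xml:title")                 # parent-child edge
g.add_edge("xml:paper", "row:author:7")              # reference-style edge

print(g.search("keyword alice", radius=1))
```

With `radius=1`, only `xml:paper` reaches both a node containing "keyword" and a node containing "alice", so it is the single answer root. A real system would precompute and index such neighborhoods rather than running a breadth-first search per candidate at query time.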
