KWilt: A Semantic Patchwork for Flexible Access to Heterogeneous Knowledge

Semantic wikis and other modern knowledge management systems differ from traditional knowledge bases in that their information ranges from unstructured (wiki pages) through semi-formal (tags) to formal (RDF or OWL) and is produced by users with varying levels of expertise. KWQL is a query language for semantic wikis that scales with a user's level of expertise by combining ideas from keyword query languages with aspects of formal query languages such as SPARQL. In this paper, we discuss KWilt, KWQL's implementation. For each data format and query type, KWilt uses technology tailored to that setting and combines, in a patchwork fashion, information retrieval, structure matching, and constraint evaluation tools with only lightweight "glue". We show that it is possible to efficiently recognize KWQL queries that can be evaluated using only information retrieval, or only information retrieval and structure matching. This allows KWilt to evaluate basic queries at almost the speed of the underlying search engine while still providing the full power of first-order queries where needed. Moreover, adding new data formats or capabilities is easier than in a monolithic system.
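
The idea of recognizing which queries need only the cheaper evaluation steps can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Query` class, the `Stage` names, and the classification criteria (nesting triggers structure matching, variables trigger constraint evaluation) are simplifying assumptions for illustration, not KWQL's actual grammar or KWilt's actual rules.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Evaluation stages, ordered from cheapest to most expressive."""
    RETRIEVAL = 1    # keyword lookup in a search engine index
    STRUCTURE = 2    # additionally match nested structure
    CONSTRAINTS = 3  # additionally evaluate constraints over variables


@dataclass
class Query:
    """Toy query tree (hypothetical, not KWQL syntax)."""
    keywords: list[str] = field(default_factory=list)
    children: list["Query"] = field(default_factory=list)
    variables: list[str] = field(default_factory=list)


def required_stage(query: Query) -> Stage:
    """Return the cheapest stage that can fully evaluate the query.

    Assumption: nesting requires structure matching and any variable
    requires constraint evaluation. The tree is visited once.
    """
    stage = Stage.RETRIEVAL
    if query.children:
        stage = Stage.STRUCTURE
    if query.variables:
        stage = Stage.CONSTRAINTS
    for child in query.children:
        stage = max(stage, required_stage(child))
    return stage
```

Under these assumptions, a dispatcher could hand a query classified as `Stage.RETRIEVAL` directly to the search engine and invoke the later, more expensive stages only for queries that need them.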
