Querying Semantically Related Items Using Modified 4-Index Scheme for XML Documents

Search engines use indexing schemes to index documents for efficient search and retrieval. However, these search engines treat an XML document as a single unit and ignore the fact that it may contain records and objects that need to be indexed separately. Indexing XML documents as a whole may return irrelevant and inaccurate results for a query. In this paper, we define a technique for storing and indexing XML documents that uses the strengths of an RDBMS to overcome weaknesses in XML-based search engines. The indexing scheme preserves information about the document structure, which helps define the semantics of the document. Incorporating this structural information enables semantics-based retrieval over XML documents, which in turn helps retrieve accurate and relevant results. To demonstrate how the indexing scheme works, a case study with examples is also presented.
