INFORMATION RETRIEVAL USING INDEXING SCHEME FOR TREE PATTERN FRAMEWORK

Indexing an XML database in a data warehouse is a complex problem. The main reason is the heterogeneous and structured nature of XML data, which makes constructing query patterns tedious. Existing techniques focus on clustering methods that integrate the data warehouse with web data for Online Analytical Processing (OLAP). Clustering does not allow fast information retrieval, because the clustering technique is used specifically for building the tree pattern framework. Most XML indexing strategies split a query into several sub-queries and then join their results to answer the original query. Join operations have been identified as the most time-consuming component of XML query processing for information retrieval. To improve search over an XML database held in a data warehouse, this paper uses an indexing scheme that separates the data based on the objective. The indexing technique, XSeq, is based on the tree structure pattern framework. XSeq builds its indexing infrastructure on a much simpler foundation and represents both XML data and XML queries as structure-encoded sequences. Furthermore, the XSeq infrastructure unifies the content and the structure of XML documents, so it achieves better performance than indexing only content or only structure, or indexing the two separately. The proposed IRIS (Information Retrieval using Indexing Scheme) for XML databases in a data warehouse achieves a consistent performance improvement over an existing SDC technique for OLAP in terms of search path length, search cost, and maintenance.
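The idea of representing XML data and XML queries as structure-encoded sequences can be illustrated with a minimal Python sketch. It is not the XSeq or IRIS implementation. It assumes one common encoding, in which each node becomes a (symbol, prefix path) pair emitted in preorder and text values are prefixed with "=". The names `encode` and `is_subsequence` and the sample documents are hypothetical. Under these assumptions, a tree pattern query is answered by a subsequence test over the encoded document, without a join per sub-query.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return a preorder list of (symbol, prefix) pairs for an XML document.

    The prefix is the tuple of tags on the path from the root to the
    node's parent, so each item carries both content and structure.
    """
    root = ET.fromstring(xml_text)
    sequence = []

    def visit(node, prefix):
        sequence.append((node.tag, prefix))
        path = prefix + (node.tag,)
        text = (node.text or "").strip()
        if text:
            # Assumption: a text value is stored as its own item, marked with "=".
            sequence.append(("=" + text, path))
        for child in node:
            visit(child, path)

    visit(root, ())
    return sequence


def is_subsequence(query_seq, data_seq):
    """True if query_seq occurs in data_seq in order (not necessarily contiguously)."""
    remaining = iter(data_seq)
    return all(item in remaining for item in query_seq)


# Hypothetical sample data and queries.
doc = ("<purchase><seller><name>dell</name></seller>"
       "<buyer><name>ibm</name></buyer></purchase>")
query_hit = "<purchase><buyer><name>ibm</name></buyer></purchase>"
query_miss = "<purchase><buyer><name>dell</name></buyer></purchase>"

doc_seq = encode(doc)
print(is_subsequence(encode(query_hit), doc_seq))
print(is_subsequence(encode(query_miss), doc_seq))
```

The second query fails because "dell" appears only under the seller path, which shows how the prefix ties a value to its position in the tree. Plain subsequence matching of this kind may report false matches for some branching queries, so a practical index would need an additional verification step.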
