Managing highly correlated semi-structured data: architectural aspects of a digital archive

XML techniques are well suited to describing, managing, storing, and exchanging hierarchical, semi-structured data. Information that goes beyond hierarchical structures can still be described and exchanged in XML by employing additional concepts such as ID/IDREF or XLink. However, the available retrieval, manipulation, and storage mechanisms are far from ideal for such data, and query languages do not perform efficiently in these cases. This is especially true in scenarios such as the Digital Wossidlo Archive (WossiDiA), a project dealing with a huge number of arbitrarily correlated data units, where XML query evaluation and retrieval techniques suffer from intricate query formulation and poor efficiency. A solution for managing these data efficiently therefore needs to be devised. This paper introduces a first approach toward such a solution for the WossiDiA information system.
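
To illustrate why ID/IDREF-style links turn a hierarchical document into a general graph, the following minimal Python sketch may help. It is not taken from the WossiDiA system: the element names, the `id` and `refs` attributes, and both helper functions are assumptions made purely for illustration, and without a DTD the attributes are ordinary strings rather than typed ID/IDREFS values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: names and attributes are illustrative only.
# "refs" imitates an IDREFS attribute (whitespace-separated list of ids).
SAMPLE = """
<archive>
  <note id="n1" refs="n2 p1">Field note</note>
  <note id="n2" refs="p1">Letter</note>
  <person id="p1" refs="">Informant</person>
</archive>
"""


def build_link_graph(xml_text):
    """Map each element id to the list of ids it references."""
    root = ET.fromstring(xml_text.strip())
    graph = {}
    for elem in root.iter():
        elem_id = elem.get("id")
        if elem_id is None:
            continue  # elements without an id cannot be link targets
        graph[elem_id] = elem.get("refs", "").split()
    return graph


def reachable(graph, start):
    """Return all ids reachable from start by following references."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_link_graph(SAMPLE)
related = reachable(graph, "n1")
```

Even in this toy case, finding everything related to one unit requires a graph traversal across references rather than a walk down the element tree, which hints at why path-oriented XML querying becomes intricate for arbitrarily correlated data.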
