Temporal XML: modeling, indexing, and query processing

In this paper we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time. We study the temporal constraints imposed by the data model, and present algorithms for validating a temporal XML document against these constraints, along with methods for repairing inconsistent documents. In addition, we discuss different ways of mapping the abstract representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In the second part of the paper, we present our approach to summarizing and indexing temporal XML documents. In particular, we show that indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, dramatically increases query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to well-known path summarization schemes. Within this framework, we present two new summaries: the LCP summary and the Interval summary. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a query processing strategy based on TempIndex and on a type of ancestor-descendant encoding, denoted temporal interval encoding. We present a persistent implementation of TempIndex and compare it against two other systems, one based on a non-temporal path index and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible with real-world requirements.
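To make the two central notions concrete (recovering the document state as of a given time, and continuous paths), the following minimal Python sketch may help. It is an illustration under stated assumptions, not the paper's data model or the TempIndex implementation: edges are assumed to carry a closed validity interval with integer timestamps, an open-ended interval is modeled with infinity, and the names `Edge`, `snapshot`, and `continuous_interval` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an open-ended interval end ("still valid").
NOW = float("inf")


@dataclass(frozen=True)
class Edge:
    """A parent-child containment edge, assumed valid during [start, end]."""
    parent: str
    child: str
    start: int
    end: float  # inclusive; NOW means the edge has not been closed yet


def snapshot(edges, t):
    """Return the (parent, child) pairs whose validity interval contains t."""
    return [(e.parent, e.child) for e in edges if e.start <= t <= e.end]


def continuous_interval(path_edges):
    """Intersect the intervals along a path.

    Returns (start, end) if every edge of the path is valid throughout
    that interval, or None if there is no instant at which all edges
    are valid together.
    """
    start = max(e.start for e in path_edges)
    end = min(e.end for e in path_edges)
    return (start, end) if start <= end else None


# Illustrative data only; the element names are made up.
edges = [
    Edge("league", "team", 0, NOW),
    Edge("team", "player", 5, 20),
    Edge("player", "goal", 10, 30),
]

state_at_7 = snapshot(edges, 7)
path_validity = continuous_interval(edges)
```

In this sketch the path league/team/player/goal is continuous only while all three edges overlap, so its interval is the intersection of the edge intervals. An index over such precomputed intervals could answer a path query at time `t` without walking each edge, which is the intuition behind indexing continuous paths.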
