Multi-dimensional Aggregation for Temporal Data

Business Intelligence solutions, encompassing technologies such as multi-dimensional data modeling and aggregate query processing, are increasingly being applied to non-traditional data. This paper extends multi-dimensional aggregation to data with associated interval values that capture when the data hold. In temporal databases, intervals typically capture the states of reality that the data apply to, or capture when the data are, or were, part of the current database state. The paper proposes a new aggregation operator that addresses several challenges posed by interval data. First, the intervals to be associated with the result tuples may not be known in advance, but depend on the actual data. Such unknown intervals are accommodated by allowing result groups that are only partially specified. Second, the operator handles two interpretations of an interval: the case where the data hold at each point in the interval, and the case where the data hold only for the interval as a whole and must therefore be adjusted when applied to sub-intervals. The paper reports on an implementation of the new operator and on an empirical study indicating that the operator scales to large data sets and is competitive with other temporal aggregation algorithms.
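The two challenges can be illustrated with a small sketch. It is not the operator proposed in the paper, only a naive instant-style temporal SUM written under several assumptions: the function name `temporal_sum`, the `(start, end, value)` row layout, the half-open intervals `[start, end)`, and the proportional scaling used for values that hold only for a whole interval are all hypothetical choices made for illustration. The result intervals are derived from the endpoints found in the data rather than supplied in advance.

```python
from typing import List, Tuple

# Hypothetical row layout: (start, end, value) over the half-open interval [start, end).
Row = Tuple[int, int, float]


def temporal_sum(rows: List[Row], malleable: bool = False) -> List[Row]:
    """Sum values over the sub-intervals delimited by consecutive endpoints.

    malleable=False: a value holds at every point of its interval, so it is
                     carried unchanged into each sub-interval.
    malleable=True:  a value holds only for its whole interval, so it is scaled
                     by the fraction of the interval the sub-interval covers
                     (an assumed adjustment rule, chosen for illustration).
    """
    # Result boundaries are not known in advance; they come from the data.
    points = sorted({p for s, e, _ in rows for p in (s, e)})
    result: List[Row] = []
    for lo, hi in zip(points, points[1:]):
        total = 0.0
        covered = False
        for s, e, v in rows:
            if s <= lo and hi <= e:  # row is valid throughout [lo, hi)
                covered = True
                total += v * (hi - lo) / (e - s) if malleable else v
        if covered:  # skip gaps where no row is valid
            result.append((lo, hi, total))
    return result


rows = [(1, 5, 100.0), (3, 7, 200.0)]
print(temporal_sum(rows))                  # values held at each point
print(temporal_sum(rows, malleable=True))  # values adjusted to sub-intervals
```

The nested loop is quadratic in the number of rows and is meant only to make the semantics concrete; it says nothing about how the paper's implementation achieves its reported scalability.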
