SHIFT-SPLIT: I/O efficient maintenance of wavelet-transformed multidimensional data

The Discrete Wavelet Transform is a proven tool for a wide range of database applications. Despite its broad acceptance, however, some of its properties have not been fully explored and therefore remain unexploited, particularly for two common forms of multidimensional decomposition. We introduce two novel operations for wavelet-transformed data, termed SHIFT and SPLIT, which are based on the properties of wavelet trees and work directly in the wavelet domain. We demonstrate their significance and usefulness by analytically proving six important results in four common data maintenance scenarios: transformation of massive datasets, appending data, approximation of data streams, and partial data reconstruction. In all four scenarios the operations lead to significant I/O cost reduction. We also show how these operations can be improved further when combined with the optimal coefficient-to-disk-block allocation strategy. An exhaustive set of experiments on real-world datasets verifies our claims.
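The abstract does not define SHIFT and SPLIT, so the following is not their definition. It is only a minimal sketch of the general idea of working directly in the wavelet domain, assuming a one-dimensional, unnormalized Haar transform (pairwise averages and half-differences). The function names `haar` and `merge_halves` are hypothetical and chosen for illustration. The sketch shows that the transform of two concatenated blocks can be assembled from the two already-transformed blocks without touching the original data, which is the kind of property that makes appending cheap.

```python
def haar(x):
    """Unnormalized Haar transform of a list whose length is a power of two.

    Returns [overall average, details from coarsest level to finest level].
    """
    a = list(x)
    details = []
    while len(a) > 1:
        avg = [(a[i] + a[i + 1]) / 2 for i in range(0, len(a), 2)]
        dif = [(a[i] - a[i + 1]) / 2 for i in range(0, len(a), 2)]
        details = dif + details  # coarser levels end up in front
        a = avg
    return a + details


def merge_halves(wl, wr):
    """Combine the Haar transforms of two equal-length blocks.

    Hypothetical helper: only the two block averages are recomputed;
    every detail coefficient is copied to its new position unchanged.
    """
    n = len(wl)
    out = [(wl[0] + wr[0]) / 2, (wl[0] - wr[0]) / 2]
    s = 1
    while s < n:
        # Level with s coefficients per block: left block first, then right.
        out += wl[s:2 * s] + wr[s:2 * s]
        s *= 2
    return out


left = [2, 4, 6, 8]
right = [1, 3, 5, 7]
merged = merge_halves(haar(left), haar(right))
direct = haar(left + right)
print(merged == direct)
```

In this sketch only two coefficients are newly computed per merge; the remaining ones are relocated, which hints at why operating in the wavelet domain can avoid re-reading and re-transforming existing data.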
