Quantized Indexing Tree for Frequent Updates over Data Streams

Emerging hardware and communication technologies enable new data stream applications that must deal efficiently with very high rates of data updates. In this paper, we propose a novel index structure, termed the QDM-tree (quantized R*-tree with double memos), to support efficient similarity search over data streams. We integrate quantized minimum bounding spheres (QMBSs) and quantized minimum bounding rectangles (QMBRs), which improve the cache behavior of the QDM-tree by packing more entries into each node and reducing the tree height. Two compact main-memory memos (the Insert Memo and the Delete Memo) accelerate insert operations and reduce the cost of an update to the cost of an insert. In addition, the RamDisk technique reduces the cost of disk accesses to the cost of memory accesses. Theoretical analysis and experimental evaluation demonstrate that the QDM-tree significantly outperforms other state-of-the-art R-tree variants under frequent updates and is better suited to massive data streams.
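To make the idea of a quantized minimum bounding rectangle concrete, the following is a minimal sketch of one common approach: storing a child rectangle as small integer grid coordinates relative to its parent rectangle, rounding outward so that the quantized box still encloses the original. The function names, the `bits` parameter, and the (lo, hi)-per-dimension layout are assumptions made for illustration and are not taken from the paper; the QDM-tree's actual encoding may differ.

```python
import math

def quantize_mbr(child, parent, bits=8):
    """Quantize a child MBR relative to its parent MBR (hypothetical helper).

    child, parent: lists of (lo, hi) pairs, one pair per dimension.
    Returns a list of integer (qlo, qhi) pairs on a grid with 2**bits levels.
    Lower bounds are rounded down and upper bounds rounded up, so the
    quantized box is conservative (it covers the original box).
    """
    levels = (1 << bits) - 1
    quantized = []
    for (clo, chi), (plo, phi) in zip(child, parent):
        extent = phi - plo
        if extent == 0:
            # Degenerate parent side: the only representable cell is 0.
            quantized.append((0, 0))
            continue
        qlo = math.floor((clo - plo) / extent * levels)
        qhi = math.ceil((chi - plo) / extent * levels)
        # Clamp to the grid in case the child touches the parent border.
        quantized.append((max(0, qlo), min(levels, qhi)))
    return quantized

def dequantize_mbr(quantized, parent, bits=8):
    """Recover the approximate (enclosing) MBR from its quantized form."""
    levels = (1 << bits) - 1
    approx = []
    for (qlo, qhi), (plo, phi) in zip(quantized, parent):
        extent = phi - plo
        approx.append((plo + qlo / levels * extent,
                       plo + qhi / levels * extent))
    return approx

if __name__ == "__main__":
    parent = [(0.0, 100.0), (0.0, 50.0)]
    child = [(12.3, 47.9), (5.0, 20.5)]
    q = quantize_mbr(child, parent)
    print(q)
    print(dequantize_mbr(q, parent))
```

With 8 bits per coordinate, each bound takes one byte instead of a 4- or 8-byte float, which is why more entries fit in a node of fixed size. The trade-off is a slightly larger box, which can cause extra false-positive node visits during search. In this sketch, floating-point rounding at exact grid boundaries is not specially handled.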
