Data compression and query for large scale sensor data on COTS DBMS

Multi-dimensional temporal data sets are the common format for storing sampled temporal data in sensor network applications. Over time, the core tables in such a data set may grow so large that they become unmanageable. Reducing storage space while still allowing on-line queries makes the trade-off between compression effectiveness and on-line query performance a challenging issue. In this paper, we present an effective framework for temporal data sets that does not sacrifice on-line query performance and is specifically designed for very large sensor network databases. The sampled data are compressed using several candidate approaches, including dictionary-based compression and lossless vector quantization. On-line queries are executed directly on the compressed data set, without decompressing it, so as to enhance query performance. Experiments are conducted on a power meter database and the Sonoma database to evaluate the proposed methodologies in terms of data compression rate and data query speed. The results show that the compression rate ranges from 70% for numerical data to 20% for character data, while the added overhead for on-line queries is at most 2%.
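
The idea of querying without decompression can be illustrated with a minimal sketch of dictionary-based compression. This is not the paper's implementation: the function names `dictionary_encode` and `select_equal` and the sample meter identifiers are hypothetical, chosen only to show how an equality predicate can be evaluated on integer codes instead of on the original values.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Replace each distinct value with a small integer code (hypothetical helper)."""
    codes: Dict[str, int] = {}
    encoded: List[int] = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)  # next unused code
        encoded.append(codes[value])
    return codes, encoded


def select_equal(codes: Dict[str, int], encoded: List[int], target: str) -> List[int]:
    """Return row indices where the column equals `target`.

    The predicate constant is translated to its code once; the scan then
    compares integer codes only and never decodes the stored column.
    """
    code = codes.get(target)
    if code is None:  # value never occurs, so no row can match
        return []
    return [row for row, c in enumerate(encoded) if c == code]


# Made-up sample column of meter identifiers (illustrative data only).
meter_ids = ["M-001", "M-002", "M-001", "M-003", "M-001"]
codes, encoded = dictionary_encode(meter_ids)
rows = select_equal(codes, encoded, "M-001")
```

In this sketch the column is stored as the list `encoded` plus the small dictionary `codes`, so repeated values cost one integer each, and an equality query touches only the encoded form.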
