Database Compression with Data Mining Methods

Despite falling prices, storage remains a major cost factor in large-scale database applications such as data warehouses, and data compression is needed to reduce that cost. Many data compression techniques have been proposed, and the issue of database compression has been discussed. Conventional data compression techniques, however, require compressed data to be decompressed before read or write operations can be carried out, which makes them impractical for databases in active use. In this chapter, we propose a database compression technique that needs only partial decompression for read operations and no decompression for write operations. It is suitable for databases in active use and can be used to compress data in relational databases. The proposed technique finds rules in a relational database using the Apriori Algorithm and stores data using those rules to achieve high compression ratios. The rules are in turn stored in a deductive database to enable easy data access.
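
The abstract does not spell out the rule format or storage layout, so the following is only a minimal sketch of the general idea, under stated assumptions. It mines one-antecedent rules that hold without exception (a much simpler search than the Apriori Algorithm named above), omits any field value that a rule can regenerate, and rebuilds a single row on read. The function names (mine_rules, compress, read_row), the city/country example table, and the min_support threshold are all hypothetical and are not taken from the chapter.

```python
from collections import Counter, defaultdict


def mine_rules(rows, antecedent, consequent, min_support=2):
    """Return {antecedent_value: consequent_value} for rules with no exceptions.

    Assumption: a rule is kept only if it holds in every row where the
    antecedent value occurs and is supported by at least min_support rows.
    This is a stand-in for a real association-rule miner such as Apriori.
    """
    seen = defaultdict(Counter)
    for row in rows:
        seen[row[antecedent]][row[consequent]] += 1
    rules = {}
    for value, consequents in seen.items():
        if len(consequents) == 1 and sum(consequents.values()) >= min_support:
            rules[value] = next(iter(consequents))
    return rules


def compress(rows, rules, antecedent, consequent):
    """Drop the consequent field from every row that a rule can regenerate."""
    stored = []
    for row in rows:
        key = row[antecedent]
        if key in rules and rules[key] == row[consequent]:
            stored.append({k: v for k, v in row.items() if k != consequent})
        else:
            stored.append(dict(row))  # no applicable rule: store the row as-is
    return stored


def read_row(stored_row, rules, antecedent, consequent):
    """Rebuild one row on demand; other rows are left untouched."""
    if consequent in stored_row:
        return dict(stored_row)
    return {**stored_row, consequent: rules[stored_row[antecedent]]}


rows = [
    {"id": 1, "city": "Osaka", "country": "Japan"},
    {"id": 2, "city": "Osaka", "country": "Japan"},
    {"id": 3, "city": "Paris", "country": "France"},
    {"id": 4, "city": "Paris", "country": "France"},
    {"id": 5, "city": "Lima", "country": "Peru"},
]
rules = mine_rules(rows, "city", "country")
stored = compress(rows, rules, "city", "country")
restored = [read_row(r, rules, "city", "country") for r in stored]
```

In this sketch a read touches only the requested row and the rule table, which is one way to picture "partial decompression". How the chapter handles writes and how the rules are represented in the deductive database are not covered here.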
