Data compression and database performance

Data compression is widely used in data management to save storage space and network bandwidth. The authors outline the performance improvements that can be achieved by exploiting data compression in query processing. The novel idea is to leave data in a compressed state as long as possible and to decompress it only when absolutely necessary. They show that many query processing algorithms can manipulate compressed data just as well as decompressed data, and that processing compressed data can speed up query processing by a factor much larger than the compression factor.
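
To make the idea of operating on compressed data concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple dictionary encoding in which both join inputs share one code table, so that equal values always map to equal codes. The function names (build_dictionary, compress, hash_join_on_codes, decompress) and the sample relations are hypothetical and chosen only for illustration. The join compares integer codes throughout and decodes a value only when the final result is produced.

```python
from collections import defaultdict


def build_dictionary(values):
    """Assign each distinct value a small integer code (assumed shared by both inputs)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def compress(rows, column, dictionary):
    """Return copies of the rows with the given column replaced by its code."""
    return [{**row, column: dictionary[row[column]]} for row in rows]


def hash_join_on_codes(left, right, column):
    """Equality hash join that compares codes only; no value is decoded here."""
    table = defaultdict(list)
    for row in left:
        table[row[column]].append(row)
    return [(l, r) for r in right for l in table.get(r[column], [])]


def decompress(code, dictionary):
    """Map a code back to its original value; used only when producing output."""
    inverse = {i: v for v, i in dictionary.items()}
    return inverse[code]


# Hypothetical sample relations.
employees = [{"name": "Ann", "dept": "Sales"}, {"name": "Bob", "dept": "Research"}]
departments = [{"dept": "Research", "floor": 3}, {"dept": "Sales", "floor": 1}]

# One dictionary for the join column of both relations.
dictionary = build_dictionary([r["dept"] for r in employees + departments])
employees_c = compress(employees, "dept", dictionary)
departments_c = compress(departments, "dept", dictionary)

# The join runs entirely on codes; decoding happens in the final projection.
pairs = hash_join_on_codes(employees_c, departments_c, "dept")
result = [(l["name"], decompress(l["dept"], dictionary), r["floor"]) for l, r in pairs]
```

In this sketch the hash table and every comparison handle small integers rather than full strings, which suggests why the benefit can exceed the space saving alone: more rows fit in memory and each comparison is cheaper. This works for equality predicates only because identical values receive identical codes under the shared dictionary assumed here.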
