Compression Using Encryption

The basic objective of a data compression algorithm is to reduce the redundancy in data representation so as to decrease the storage requirement. Data compression also reduces communication cost by making effective use of the available bandwidth, and it becomes more important as file storage becomes a problem. In general, data compression consists of taking a stream of symbols and transforming them into codes. If the compression is effective, the resulting stream of codes will be smaller than the original stream of symbols. The decision to output a certain code for a certain symbol or set of symbols is based on a model. The model, as described in Nelson and Gailly [1], is simply a collection of data and rules used to process input symbols and determine which code(s) to output. A program uses the model to define the probabilities for each symbol accurately, and the coder to produce an appropriate code based on those probabilities.

Text compression can be divided into two categories: statistical-based and dictionary-based. A statistical technique encodes a single symbol at a time by reading it in, calculating a probability, and then outputting a single code. A dictionary-based compression scheme [2–4] uses a different approach. It reads in input data and looks for groups of symbols that appear in the dictionary. If a string match is found, a pointer or index into the dictionary can be output instead of the codes for the individual symbols. Thus, the longer the match found, the better the compression ratio. The dictionary-based approach is in turn divided into two categories: static dictionary methods and dynamic dictionary methods. A static dictionary is built before compression occurs and does not change while the data are being compressed; it is common to both the encoder and the decoder. In dynamic dictionary methods, compression starts with either no dictionary or a default baseline dictionary, and as compression proceeds the algorithm adds new phrases to be used later as encoded tokens. An example of dictionary-based text compression is LZW, developed by Welch [1].
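The dynamic dictionary idea can be illustrated with a minimal LZW sketch. This is not taken from the paper: the function names `lzw_compress` and `lzw_decompress`, the 256-entry baseline dictionary, and the restriction to characters with code points below 256 are assumptions made for the example. Codes are kept as plain integers rather than packed into bits.

```python
def lzw_compress(text: str) -> list[int]:
    """Encode text as a list of dictionary indices (illustrative LZW sketch)."""
    # Assumed baseline dictionary: one entry per character with code point < 256.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    phrase = ""
    codes = []
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match: longer matches mean fewer codes.
            phrase = candidate
        else:
            # Emit the index of the longest match, then learn the new phrase.
            codes.append(dictionary[phrase])
            dictionary[candidate] = next_code
            next_code += 1
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the text; the decoder grows the same dictionary as the encoder."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the phrase the encoder had only just added.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


if __name__ == "__main__":
    sample = "TOBEORNOTTOBEORTOBEORNOT"
    encoded = lzw_compress(sample)
    print(len(sample), "symbols ->", len(encoded), "codes")
    print(lzw_decompress(encoded) == sample)
```

Note that the dictionary is never transmitted: the decoder reconstructs it from the code stream, which is what distinguishes this dynamic scheme from a static dictionary shared in advance by encoder and decoder.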

[1] C. H. Wong, et al. Dynamic word based text compression, 1997, Proceedings of the Fourth International Conference on Document Analysis and Recognition.

[2] Peter K. Pearson, et al. Fast hashing of variable-length text strings, 1990, CACM.

[3] Mark Nelson, et al. The Data Compression Book, 2nd Edition, 1996.

[4] Amar Mukherjee, et al. Data compression using encrypted text, 1996, Proceedings of Data Compression Conference - DCC '96.

[5] Simon L. Peyton Jones, et al. Word-based dynamic algorithms for data compression, 1992.

[6] R. Nigel Horspool, et al. Constructing word-based text compression algorithms, 1992, Data Compression Conference, 1992.