Investigating Randomness of the LZSS Compression Algorithm

Random number generators play a critical role in many important applications. In practice, statistical testing is used to gather evidence that a generator produces numbers that appear random. In this paper, we examine the randomness of data compressed with the LZSS compression algorithm. Our test results suggest that the output of LZSS has poor randomness. We also investigate randomization methods based on LZSS. We present and implement a pseudo-random sequence generator (PRNG), L12RC4, inspired by the LZSS compression algorithm and the RC4 stream cipher. The results of the NIST and Diehard test suites indicate that L12RC4 is a good PRNG; it appears sound and may be suitable for some cryptographic applications. We also found that the probability distribution of index-value frequencies depends on the number of compression passes and on the INDEX_BIT_COUNT value. In one-pass mode, the greater the INDEX_BIT_COUNT value, the more uniform the distribution, and double-pass mode yields better uniformity than one-pass mode.
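
As an illustration of the kind of statistical evidence referred to above, the following is a minimal sketch of the frequency (monobit) test, the simplest test in the NIST suite. It is not the code used in this paper; the function name monobit_p_value is hypothetical, and the 0.01 significance level is the conventional NIST default, assumed here. The test checks whether the numbers of ones and zeros in a bit sequence are approximately equal, as expected for a random sequence.

```python
import math


def monobit_p_value(data: bytes) -> float:
    """Return the p-value of the frequency (monobit) test for a byte string.

    Hypothetical helper for illustration only. A small p-value (below 0.01
    by convention) is evidence that the bits are not uniformly distributed.
    """
    n = len(data) * 8  # total number of bits
    if n == 0:
        raise ValueError("need at least one byte of input")
    ones = sum(bin(b).count("1") for b in data)
    # Map each bit to +1 (one) or -1 (zero) and sum: S_n = ones - zeros.
    s_n = 2 * ones - n
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


if __name__ == "__main__":
    balanced = bytes([0x55] * 100)  # alternating bits: equal ones and zeros
    all_zero = bytes(100)           # heavily biased input
    print(monobit_p_value(balanced) >= 0.01)  # passes this test
    print(monobit_p_value(all_zero) >= 0.01)  # fails this test
```

Passing this test alone says little: the alternating pattern above is perfectly balanced yet obviously non-random, which is why full suites such as NIST and Diehard combine many different tests.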