Cryptanalysis of Lempel-Ziv Compressed and Encrypted Text: The Statistics of Compression

Modern secure communication systems typically follow a common pattern at the transmitter: the data is first compressed, then encrypted, and finally passed through additional encoders that mitigate channel noise and similar effects. One purpose of the compression step is to remove statistical information about the plaintext, so as to render the ciphertext impervious to statistical attacks. It is well known, however, that in practice there is no such thing as a universal compression algorithm; thus, some statistical information about the plaintext tends to survive the compression process. In this paper, we consider Lempel-Ziv-Welch (LZW) compression and analyze its effectiveness in removing statistical information from English plaintext. Specifically, we present several techniques that exploit the structure of the compression algorithm to launch a successful statistical attack on compressed and encrypted data. All attacks are ciphertext-only, and one of them relies on linear programming. Our attacks indicate that an eavesdropper may require more ciphertext to succeed when compression is used. Nevertheless, the adaptive nature of the Lempel-Ziv compression technique leaves its own statistics on the message, and an attacker can exploit them.
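To make the compression step concrete, the following is a minimal sketch of textbook LZW encoding in Python. It is not the attack described in the paper, and it assumes an initial dictionary of the 256 single-byte characters with codes that are never reset; the names `lzw_compress`, `sample`, and `histogram` are illustrative only.

```python
from collections import Counter


def lzw_compress(text: str) -> list[int]:
    """Textbook LZW: emit the dictionary code of each longest known prefix.

    Assumption: the dictionary starts with the 256 single-character
    strings (codes 0..255) and grows without bound.
    """
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the match while it is still in the dictionary.
            current = candidate
        else:
            # Emit the longest match and learn the new, longer string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


# Illustrative English-like input (hypothetical, not data from the paper).
sample = "the cat sat on the mat and the rat sat on the hat " * 4
codes = lzw_compress(sample)

# Count how often each output code appears in the compressed stream.
histogram = Counter(codes)
```

Because the dictionary is built adaptively from the input, the first codes emitted are literal character codes, and later codes point to strings that already occurred in the plaintext. Inspecting `histogram` for a given input is one simple way to see whether the code stream is far from uniform, which is the kind of residual structure the abstract refers to.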
