Authentication of LZ-77 compressed data

The formidable dissemination capability of current network technology makes it increasingly important to devise new methods for ensuring the authenticity of documents. It is now common practice to distribute documents in compressed form. In this paper, we propose a simple variation on the classic LZ-77 algorithm that allows one to hide, within the compressed document, enough information to warrant its authenticity. The design relies on the unpredictability of a certain class of pseudo-random generators, so that an attacker who does not know the secret bit-string key cannot retrieve the hidden data in a reasonable amount of time. Since the compressed document can still be decompressed by the original LZ-77 algorithm, the embedding is completely "transparent". Preliminary experiments also show that the degradation in compression due to the embedding is almost negligible.
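The abstract does not spell out the embedding mechanism, so the following is only a minimal sketch of one way an LZ-77 parser can carry hidden bits while staying decodable by an unmodified decoder. It assumes that the bits are encoded in the choice among several equally long longest matches in the window; this tie-breaking rule, the token layout, and the names `candidates`, `encode`, `decode`, and `extract` are illustrative assumptions, not the paper's construction. The keyed pseudo-random generator that protects the hidden bits is omitted entirely.

```python
from typing import List, Tuple

# Hypothetical token layout: (offset back, match length, next literal character).
Token = Tuple[int, int, str]


def candidates(text: str, pos: int, window: int) -> Tuple[int, List[int]]:
    """Return the longest match length at pos and every window start achieving it."""
    best_len, starts = 0, []
    limit = len(text) - 1  # always leave one character for the literal
    for start in range(max(0, pos - window), pos):
        length = 0
        while pos + length < limit and text[start + length] == text[pos + length]:
            length += 1
        if length > best_len:
            best_len, starts = length, [start]
        elif length == best_len and length > 0:
            starts.append(start)
    return best_len, starts


def encode(text: str, bits: str, window: int = 255) -> List[Token]:
    """Greedy LZ-77 parse; ties among longest matches carry bits (assumed rule)."""
    tokens, pos, used = [], 0, 0
    while pos < len(text):
        length, starts = candidates(text, pos, window)
        offset = 0
        if length > 0:
            k = len(starts).bit_length() - 1  # floor(log2(q)) bits fit in q choices
            chunk = bits[used:used + k].ljust(k, "0")  # pad with zeros when bits run out
            used += k
            choice = int(chunk, 2) if k else 0
            offset = pos - starts[choice]
        tokens.append((offset, length, text[pos + length]))
        pos += length + 1
    return tokens


def decode(tokens: List[Token]) -> str:
    """Plain LZ-77 decoding; it knows nothing about the hidden bits."""
    out: List[str] = []
    for offset, length, literal in tokens:
        for _ in range(length):
            out.append(out[-offset])  # character-wise copy handles overlapping matches
        out.append(literal)
    return "".join(out)


def extract(tokens: List[Token], window: int = 255) -> str:
    """Recover hidden bits by recomputing the tie set at every match token."""
    text, bits, pos = decode(tokens), [], 0
    for offset, length, _ in tokens:
        if length > 0:
            _, starts = candidates(text, pos, window)
            k = len(starts).bit_length() - 1
            if k:
                bits.append(format(starts.index(pos - offset), f"0{k}b"))
        pos += length + 1
    return "".join(bits)


text = "abcXabcYabcZabcWabc."
tokens = encode(text, "1011")
print(decode(tokens) == text)  # True: the ordinary decoder is unaffected
print(extract(tokens))         # 1011
```

In this sketch every match has the same length regardless of which tie is chosen, so the token count is unchanged by the hidden bits; any compression cost would come only from how the offsets are subsequently coded.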
