On scrambling the Burrows-Wheeler transform to provide privacy in lossless compression

The usual way to ensure the confidentiality of compressed data is to encrypt it with a standard encryption algorithm. Although the computational cost of encryption is tolerable in most cases, its main disadvantage is the loss of flexibility: the encryption layer prevents pattern matching on the compressed data. An alternative way to provide privacy in compression is to alter the compression algorithm so that decompression requires knowledge of some secret parameters. Securing arithmetic and Huffman coders, as well as dictionary-based schemes, has been studied previously, whereas the Burrows-Wheeler transform (BWT) has not been addressed in that sense. Given the BWT of some input data, it is not possible to perform a successful search on it, or to reconstruct any part of the input, without knowing the lexicographical ordering used in the construction. Based on this observation, this study investigates methods to provide privacy in the BWT by using a randomly selected permutation of the input symbols as the lexicographical order. The proposed technique aims to support pattern matching on compressed data while still retaining confidentiality. Unifying compression and security in a single step is also considered as an alternative to the two-level compress-then-encrypt paradigm.
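The core idea can be illustrated with a minimal sketch: a BWT whose rotations are sorted under a secret permutation of the alphabet instead of the natural symbol order. This is not the authors' implementation. The function names, the `\x00` sentinel, the seed-derived key, and the quadratic rotation sort are all assumptions chosen for brevity, and Python's `random` module is not a cryptographically secure source of permutations.

```python
import random

SENTINEL = "\x00"  # assumed end-of-text marker; always ranked lowest


def make_key(alphabet, seed):
    """Hypothetical key setup: map each symbol to its rank in a shuffled order."""
    symbols = sorted(set(alphabet) - {SENTINEL})
    random.Random(seed).shuffle(symbols)  # illustration only, not a secure PRNG
    rank = {SENTINEL: 0}
    rank.update({sym: i + 1 for i, sym in enumerate(symbols)})
    return rank


def scrambled_bwt(text, rank):
    """BWT of text + sentinel, with rotations sorted by the secret rank order."""
    s = text + SENTINEL
    n = len(s)
    # Naive rotation sort (quadratic memory); fine for a small demonstration.
    starts = sorted(range(n), key=lambda i: [rank[c] for c in s[i:] + s[:i]])
    # The last column holds the symbol preceding each sorted rotation.
    return "".join(s[i - 1] for i in starts)


def inverse_scrambled_bwt(last, rank):
    """Invert the transform; requires the same rank table used to build it."""
    n = len(last)
    # A stable sort of the last column by secret rank yields the first column.
    order = sorted(range(n), key=lambda i: rank[last[i]])
    lf = [0] * n
    for first_row, last_row in enumerate(order):
        lf[last_row] = first_row  # LF-mapping under the secret order
    out = []
    row = 0  # row 0 is the rotation that starts with the sentinel
    for _ in range(n - 1):
        out.append(last[row])
        row = lf[row]
    return "".join(reversed(out))


if __name__ == "__main__":
    key = make_key("banana", seed=2024)
    transformed = scrambled_bwt("banana", key)
    print(repr(transformed))
    print(inverse_scrambled_bwt(transformed, key))
```

In this sketch the rank table plays the role of the secret parameter: inversion rebuilds the first column by sorting the last column under that order, so a party holding the wrong table would follow an incorrect LF-mapping. The sketch makes no claim about how strong that protection is.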

[1]  Hyungjin Kim,et al.  Secure Arithmetic Coding , 2007, IEEE Transactions on Signal Processing.

[2]  M. Bellare,et al.  Searchable Encryption Revisited: Consistency Properties, Relation to Anonymous IBE, and Extensions , 2008, Journal of Cryptology.

[3]  Robert E. Tarjan,et al.  A Locally Adaptive Data , 1986 .

[4]  Travis Gagie,et al.  Move-to-Front, Distance Coding, and Inversion Frequencies revisited , 2010, Theor. Comput. Sci..

[5]  Oscar C. Au,et al.  Adaptive Chosen-Ciphertext Attack on Secure Arithmetic Coding , 2009, IEEE Transactions on Signal Processing.

[6]  Eli Biham,et al.  A Known Plaintext Attack on the PKZIP Stream Cipher , 1994, FSE.

[7]  Oscar C. Au,et al.  Secure Lempel-Ziv-Welch (LZW) algorithm with random dictionary insertion and permutation , 2008, 2008 IEEE International Conference on Multimedia and Expo.

[8]  Giovanni Manzini,et al.  Opportunistic data structures with applications , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[9]  Terry A. Welch,et al.  A Technique for High-Performance Data Compression , 1984, Computer.

[10]  Ed Dawson,et al.  Cryptanalysis of Adaptive Arithmetic Coding Encryption Schemes , 1997, ACISP.

[11]  Enrico Magli,et al.  Selective encryption of JPEG 2000 images by means of randomized arithmetic coding , 2004, IEEE 6th Workshop on Multimedia Signal Processing, 2004..

[12]  Tadayoshi Kohno,et al.  Attacking and repairing the winZip encryption scheme , 2004, CCS '04.

[13]  Raphael C.-W. Phan,et al.  On the security of the WinRAR encryption feature , 2006, International Journal of Information Security.

[14]  Colin Boyd,et al.  Resisting the Bergen-Hogan Attack on Adaptive Arithmetic Coding , 1997, IMACC.

[15]  Michael Stay ZIP Attacks with Reduced Known Plaintext , 2001, FSE.

[16]  Enrico Magli,et al.  Multimedia Selective Encryption by Means of Randomized Arithmetic Coding , 2006, IEEE Transactions on Multimedia.

[17]  Wing-Kai Hon,et al.  Geometric Burrows-Wheeler Transform: Linking Range Searching and Text Indexing , 2008, Data Compression Conference (dcc 2008).

[18]  John G. Cleary,et al.  On the insecurity of arithmetic coding , 1995, Comput. Secur..

[19]  A. Barbir A methodology for performing secure data compression , 1997, Proceedings The Twenty-Ninth Southeastern Symposium on System Theory.

[20]  James M. Hogan,et al.  A chosen plaintext attack on an adaptive arithmetic coding compression algorithm , 1993, Comput. Secur..

[21]  Shmuel Tomi Klein,et al.  Compressed Pattern Matching in Jpeg Images , 2006, Int. J. Found. Comput. Sci..

[22]  Stephen R. Tate,et al.  Higher compression from the Burrows-Wheeler transform by modified sorting , 1998, Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225).

[23]  C.-C. Jay Kuo,et al.  Design of integrated multimedia compression and encryption systems , 2005, IEEE Transactions on Multimedia.

[24]  Wenjun Zeng,et al.  Efficient frequency domain selective scrambling of digital video , 2003, IEEE Trans. Multim..

[25]  Ian H. Witten,et al.  On the privacy afforded by adaptive text compression , 1988, Comput. Secur..

[26]  D. J. Wheeler,et al.  A Block-sorting Lossless Data Compression Algorithm , 1994 .

[27]  Ayumi Shinohara,et al.  A Boyer-Moore Type Algorithm for Compressed Pattern Matching , 2000, CPM.

[28]  Gonzalo Navarro,et al.  Compressed full-text indexes , 2007, CSUR.

[29]  Tao Tao,et al.  Compressed pattern matching for text and images , 2005 .

[30]  James M. Hogan,et al.  Data security in a fixed-model arithmetic coding compression algorithm , 1992, Comput. Secur..

[31]  Wing-Kai Hon,et al.  Compression, Indexing, and Retrieval for Massive String Data , 2010, CPM.

[32]  Donald A. Adjeroh,et al.  Searching BWT compressed text with the Boyer-Moore algorithm and binary search , 2002, Proceedings DCC 2002. Data Compression Conference.

[33]  Vincent Rijmen,et al.  The Design of Rijndael: AES - The Advanced Encryption Standard , 2002 .

[34]  Gary Benson,et al.  Let sleeping files lie: pattern matching in Z-compressed files , 1994, SODA '94.

[35]  Haim Kaplan,et al.  Most Burrows-Wheeler Based Compressors Are Not Optimal , 2007, CPM.

[36]  Travis Gagie,et al.  Move-to-Front, Distance Coding, and Inversion Frequencies Revisited , 2007, CPM.