Improving Chinese Storing Text Retrieval Systems' Security via a Novel Maximal Prefix Coding

Huffman coding is widely used in data, image, and video compression. In this paper, a novel maximal prefix coding is introduced, and the relationship between Huffman coding and optimal maximal prefix coding is discussed. We show that all Huffman coding schemes are optimal maximal prefix coding schemes and that, conversely, optimal maximal prefix coding schemes need not be Huffman coding schemes. Moreover, it is proven that, for any maximal prefix code C, there exists an information source I = (Σ, P) such that C is exactly a Huffman code for I. In essence, this shows that the class of Huffman codes coincides with the class of maximal prefix codes. A case study of data compression is also given. Compared with Huffman coding, maximal prefix coding can be used not only for statistical modeling but also for dictionary methods. It is also well suited to large information retrieval systems and to improving their security.
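The claim that every maximal prefix code is a Huffman code for some source can be illustrated with a minimal sketch. It assumes a finite code over the binary alphabet {0, 1}, for which a prefix code is maximal exactly when its Kraft sum equals 1. The function names and the dyadic source P(w) = 2^(-|w|) are illustrative choices, not necessarily the construction used in the paper, and the sketch compares only codeword lengths, not the full proof.

```python
import heapq
from fractions import Fraction
from itertools import count


def is_prefix_code(code):
    """True if no codeword is a proper prefix of another (hypothetical helper)."""
    words = sorted(code)
    # In sorted order, a word that prefixes any other word prefixes its successor.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def kraft_sum(code):
    """Exact Kraft sum of a binary code: sum of 2^(-|w|) over all codewords."""
    return sum(Fraction(1, 2 ** len(w)) for w in code)


def is_maximal_prefix_code(code):
    """Assumption: finite binary code, so maximal <=> prefix and Kraft sum == 1."""
    return is_prefix_code(code) and kraft_sum(code) == 1


def huffman_lengths(probs):
    """Standard Huffman merging; returns the codeword length of each symbol."""
    tie = count()  # unique tie-breaker so the symbol lists are never compared
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:  # every symbol under the merged node gets one bit deeper
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


# Example maximal prefix code C and the dyadic source built from it.
C = {"0", "10", "110", "111"}
source = {w: Fraction(1, 2 ** len(w)) for w in C}

if __name__ == "__main__":
    print(is_maximal_prefix_code(C))
    print(huffman_lengths(source) == {w: len(w) for w in C})
```

Because the source is dyadic, any optimal code must assign length |w| to the symbol with probability 2^(-|w|), so the Huffman lengths match the lengths of C regardless of how ties are broken.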
