On Non-sequential Context Modeling with Application to Executable Data Compression

The sequential context modeling framework is generalized to a non-sequential one by relaxing the context from a consecutive suffix of the preceding symbol sequence to a permutation of the preceding symbols, motivated by the complex context structures found in sources such as video and program binaries. The context weighting tree is likewise extended to a series of context trees built according to a "model tree", which describes the descendant relationships in the formation of non-sequential context sets. Model redundancy and the maximum a posteriori model in this framework are discussed and compared. A decision method based on a greedy algorithm is proposed to customize sets of models that fit concrete sources. The application to executable data files, incorporating semantic and syntactic constraints, is briefly described, and experiments are conducted accordingly as validation.
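As a rough illustration of the two ideas summarized above, the following Python sketch predicts each symbol from preceding symbols at arbitrary backward offsets (so the context need not be a consecutive suffix) and then picks offsets greedily by adaptive code length. It is a minimal sketch under stated assumptions, not the method of the paper: the names `code_length` and `greedy_offsets`, the 256-symbol byte alphabet, the Krichevsky-Trofimov-style estimator with 1/2 added per symbol, and the stopping rule are all illustrative choices, and no context tree weighting or model tree is implemented.

```python
import math
from collections import defaultdict

ALPHABET = 256  # assumption: symbols are bytes


def code_length(data, offsets):
    """Adaptive code length (bits) of `data` when each symbol is predicted
    from the preceding symbols found at the given backward offsets.

    offsets=(1, 2) is an ordinary order-2 suffix context; offsets=(2,) or
    (1, 3) are non-sequential contexts that skip some preceding symbols.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, sym in enumerate(data):
        # Build the context; positions before the start are marked None.
        ctx = tuple(data[i - d] if i - d >= 0 else None for d in offsets)
        # KT-style estimate: add 1/2 to every symbol count (assumed estimator).
        p = (counts[ctx][sym] + 0.5) / (totals[ctx] + ALPHABET / 2)
        bits -= math.log2(p)
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


def greedy_offsets(data, max_offset=4):
    """Greedily add the offset that shortens the code most; stop when no
    candidate improves on the current best (hypothetical stopping rule)."""
    chosen = ()
    best = code_length(data, chosen)
    while True:
        scored = [
            (code_length(data, chosen + (d,)), d)
            for d in range(1, max_offset + 1)
            if d not in chosen
        ]
        if not scored:
            break
        bits, d = min(scored)
        if bits >= best:
            break
        chosen, best = chosen + (d,), bits
    return chosen, best
```

On data in which every other symbol follows a short cycle and the symbols in between are unrelated, the context `(2,)` would be expected to give a shorter code than the sequential context `(1,)`, which is the kind of structure a non-sequential context is meant to capture.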
