Improving PPM with Dynamic Parameter Updates

This article makes several improvements to the classic PPM algorithm, resulting in a new algorithm with superior compression effectiveness on human text. Our algorithm differs from classic PPM in three key ways: (A) in place of the original escape mechanism, we use a generalised blending method with explicit hyper-parameters that control how symbol counts are combined to form predictions; (B) different hyper-parameters are used for different classes of contexts; and (C) these hyper-parameters are updated dynamically using gradient information. The resulting algorithm (PPM-DP) compresses human text better than all currently published variants of PPM, CTW, DMC, LZ, CSE and BWT, with a runtime only slightly slower than that of classic PPM.
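To make points (A) to (C) concrete, the following is a minimal Python sketch, not the PPM-DP implementation itself. It assumes a blending rule of the strength/discount form familiar from interpolated Kneser-Ney smoothing, P_s(x) = (max(n_s(x) - beta, 0) + (U_s * beta + alpha) * P_parent(x)) / (n_s + alpha), where n_s(x) is the count of symbol x in context s, n_s is the total count and U_s is the number of distinct symbols seen in s. The abstract does not state the exact rule, so this form is an assumption. The sketch also assumes that contexts are classed by depth only, that counts are updated at every depth, and that parameters follow a plain gradient step on the log-probability of each observed symbol. The names `BlendingModel`, `observe`, `code_length_bits` and the learning rate `lr` are hypothetical.

```python
import math
from collections import defaultdict


class BlendingModel:
    """Toy context model with one (alpha, beta) pair per context depth."""

    def __init__(self, alphabet_size=256, max_order=3, alpha=0.5, beta=0.5, lr=0.02):
        self.A = alphabet_size
        self.D = max_order
        self.alpha = [alpha] * (max_order + 1)  # strength, one per depth (assumed classing)
        self.beta = [beta] * (max_order + 1)    # discount, one per depth (assumed classing)
        self.lr = lr                            # hypothetical learning rate
        self.counts = defaultdict(lambda: defaultdict(int))

    def _contexts(self, history):
        # Suffixes of the recent history, from the empty context up to max_order symbols.
        h = tuple(history[-self.D:]) if self.D > 0 else ()
        return [h[len(h) - k:] for k in range(len(h) + 1)]

    def predict(self, history, symbol):
        """Return P(symbol | history) and its gradients w.r.t. each alpha and beta."""
        p = 1.0 / self.A  # base distribution: uniform over the alphabet
        ga = [0.0] * (self.D + 1)
        gb = [0.0] * (self.D + 1)
        for k, ctx in enumerate(self._contexts(history)):
            table = self.counts.get(ctx)
            if not table:
                continue  # unseen context: fall through to the shallower prediction
            n, u = sum(table.values()), len(table)
            a, b = self.alpha[k], self.beta[k]
            c = table.get(symbol, 0)
            seen = c > 0
            w = (u * b + a) / (n + a)  # weight given to the shallower prediction
            new_p = ((c - b) if seen else 0.0) / (n + a) + w * p
            # Chain rule: gradients from shallower depths are scaled by w.
            ga = [g * w for g in ga]
            gb = [g * w for g in gb]
            ga[k] = (p - new_p) / (n + a)
            gb[k] = (u * p - (1.0 if seen else 0.0)) / (n + a)
            p = new_p
        return p, ga, gb

    def observe(self, history, symbol):
        """Code one symbol: return its cost in bits, then adapt parameters and counts."""
        p, ga, gb = self.predict(history, symbol)
        for k in range(self.D + 1):
            # Gradient ascent on log p; clipping keeps alpha > 0 and 0 <= beta < 1.
            self.alpha[k] = min(max(self.alpha[k] + self.lr * ga[k] / p, 1e-3), 100.0)
            self.beta[k] = min(max(self.beta[k] + self.lr * gb[k] / p, 0.0), 0.99)
        for ctx in self._contexts(history):
            self.counts[ctx][symbol] += 1
        return -math.log2(p)


def code_length_bits(data, **kwargs):
    """Ideal code length of `data` (an iterable of ints) under the adaptive model."""
    model = BlendingModel(**kwargs)
    history, total = [], 0.0
    for s in data:
        total += model.observe(history, s)
        history.append(s)
    return total
```

Calling `code_length_bits(b"some text")` returns the number of bits an ideal arithmetic coder would spend when driven by these predictions. Because a decoder can repeat the same parameter and count updates after decoding each symbol, the adapted parameters never need to be transmitted.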
