On the minimum description length principle for sources with piecewise constant parameters

Universal lossless coding in the presence of finitely many abrupt changes in the statistics of the source, occurring at unknown points, is investigated. The minimum description length (MDL) principle is derived for this setting. In particular, it is shown that, for any uniquely decipherable code, for almost every combination of statistical parameter vectors governing each segment, and for almost every vector of transition instants, the minimum achievable redundancy is composed of 0.5 log n/n bits for each unknown segmental parameter and log n/n bits for each transition, where n is the length of the input string. This redundancy is shown to be attainable by a strongly sequential universal encoder, i.e., an encoder that does not utilize knowledge of a prescribed value of n.
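The stated rates can be illustrated with a small numerical sketch (this is an illustration of the redundancy formula quoted above, not code from the paper; the function name and parameterization are assumptions made for the example):

```python
import math

def mdl_redundancy_bound(n: int, params_per_segment: int, transitions: int) -> float:
    """Illustrative MDL redundancy bound, in bits per symbol, for a source with
    piecewise constant parameters: 0.5*log(n)/n bits per unknown segmental
    parameter plus log(n)/n bits per unknown transition instant."""
    segments = transitions + 1                # c transitions partition the string into c+1 segments
    per_parameter = 0.5 * math.log2(n) / n    # cost of estimating each unknown parameter
    per_transition = math.log2(n) / n         # cost of describing each unknown change point
    return segments * params_per_segment * per_parameter + transitions * per_transition
```

For example, with n = 1024, one unknown parameter per segment, and a single transition, the bound is 2·log2(1024)/1024 ≈ 0.0195 bits per symbol, of which half pays for the two segmental parameters and half for locating the change point.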
