MDL estimation for small sample sizes and its application to segmenting binary strings

Minimum Description Length (MDL) estimation has proven to be of major importance in a large number of applications, many of them in computer vision and pattern recognition. Applying the associated formulas is problematic, however, particularly those for model cost, because most are asymptotic forms appropriate only for large sample sizes. J. Rissanen has recently derived sharper code-length formulas that are valid for much smaller sample sizes. Because of the importance of these results, our intent here is to present a tutorial description of them. In keeping with this goal, we have chosen a simple application whose relative tractability allows it to be explored more deeply than most problems: the segmentation of binary strings based on a piecewise Bernoulli assumption. By this we mean that each string is assumed to be divided into substrings, and the bits within each substring are assumed to have been generated by a single Bernoulli source.
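
To make the piecewise Bernoulli setting concrete, the following Python sketch segments a binary string by minimizing a total code length with dynamic programming. It is an illustration under stated assumptions, not the formulas derived in the paper: the small-sample model cost is taken to be the exact normalized maximum likelihood (NML) normalizer of the Bernoulli family, the large-sample cost is the familiar (1/2) log2 n term, and the charge of log2 N bits per interior boundary is a hypothetical choice made only for this example. All function names are likewise hypothetical.

```python
import math
from functools import lru_cache


def ml_bits(n, k):
    """Bits needed for n bits containing k ones, coded at the
    maximum-likelihood Bernoulli rate k/n."""
    bits = 0.0
    for count in (k, n - k):
        if count > 0:
            bits -= count * math.log2(count / n)
    return bits


@lru_cache(maxsize=None)
def nml_model_bits(n):
    """Exact model cost (assumption: the Bernoulli NML normalizer):
    log2 of sum_k C(n, k) (k/n)^k ((n-k)/n)^(n-k)."""
    total = 0.0
    for k in range(n + 1):
        log2_comb = (math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1)) / math.log(2)
        total += 2.0 ** (log2_comb - ml_bits(n, k))
    return math.log2(total)


def asymptotic_model_bits(n):
    """Classic large-sample model cost for one parameter: (1/2) log2 n."""
    return 0.5 * math.log2(n)


def segment(bits, model_bits=nml_model_bits):
    """Return (total code length in bits, sorted interior cut positions)."""
    n = len(bits)
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)
    # Assumption: each interior boundary costs log2(n) bits to describe.
    boundary_bits = math.log2(n) if n > 1 else 0.0
    best = [0.0] + [math.inf] * n   # best[j]: cheapest coding of bits[:j]
    back = [0] * (n + 1)            # back[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            m, k = j - i, prefix[j] - prefix[i]
            cost = best[i] + ml_bits(m, k) + model_bits(m)
            if i > 0:
                cost += boundary_bits
            if cost < best[j]:
                best[j], back[j] = cost, i
    cuts, j = [], n
    while j > 0:
        j = back[j]
        if j > 0:
            cuts.append(j)
    return best[n], sorted(cuts)


if __name__ == "__main__":
    data = [0] * 30 + [1] * 30
    print(segment(data))                                    # exact NML cost
    print(segment(data, model_bits=asymptotic_model_bits))  # asymptotic cost
```

Passing `asymptotic_model_bits` in place of `nml_model_bits` lets the two model costs be compared on the same string; the difference matters most when segments are short, which is the regime the abstract is concerned with.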
