Minimum Message Length Clustering of Spatially-Correlated Data with Varying Inter-Class Penalties

We present applications of the minimum message length (MML) principle to spatially correlated data. Spatial correlation is modelled with discrete-valued Markov random fields. The models generalise the one used in (Wallace 1998) for unsupervised classification of spatially correlated data (such as image segmentation), and we discuss how our work applies to that type of unsupervised classification. We make three new contributions. First, the rectangular grid used in (Wallace 1998) is generalised to an arbitrary graph with arbitrary edge distances. Second, we refine (Wallace 1998) slightly by including a previously discarded message length term that is important for small data sets and for a simpler problem presented here. Finally, we show how the MML principle can be used to test for the presence of spatial correlation, and how it can be used to choose between models of varying complexity to infer details of the nature of the spatial correlation.
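To make the idea of a discrete-valued Markov random field on an arbitrary graph with edge distances concrete, the following is a minimal sketch and not the paper's model or its message length calculation. It assumes a Potts-style penalty of `beta / distance` for each neighbour in a different class; that penalty form, the parameter `beta`, and all function names are hypothetical choices made for illustration. The cost returned is a sum of negative log conditional probabilities (in nits), which only loosely resembles the part of a message that encodes class labels.

```python
import math

def conditional_class_probs(node, labels, edges, num_classes, beta):
    """Distribution over the class of `node` given its neighbours' classes.

    edges maps each node to a list of (neighbour, distance) pairs.
    Assumption: disagreeing with a neighbour costs beta / distance.
    """
    weights = []
    for c in range(num_classes):
        energy = 0.0
        for nbr, dist in edges[node]:
            if labels[nbr] != c:
                energy += beta / dist
        weights.append(math.exp(-energy))
    z = sum(weights)
    return [w / z for w in weights]

def label_cost_nits(labels, edges, num_classes, beta):
    """Sum over nodes of -log P(own class | neighbours' classes)."""
    total = 0.0
    for node in edges:
        probs = conditional_class_probs(node, labels, edges, num_classes, beta)
        total -= math.log(probs[labels[node]])
    return total

# Hypothetical example: a four-node path graph with unequal edge distances.
edges = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 2.0)],
    2: [(1, 2.0), (3, 0.5)],
    3: [(2, 0.5)],
}
smooth_labels = {0: 0, 1: 0, 2: 0, 3: 0}

independent_cost = label_cost_nits(smooth_labels, edges, 2, beta=0.0)
correlated_cost = label_cost_nits(smooth_labels, edges, 2, beta=1.0)
```

With `beta = 0` every conditional distribution is uniform, so the labels cost `log(num_classes)` nits per node. With `beta > 0`, a labelling whose neighbours agree costs less, which is the intuition behind comparing a spatially correlated model against an uncorrelated one. A full MML comparison would also have to charge for stating `beta` and the class parameters, which this sketch omits.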

[1]  G. Grimmett. A Theorem about Random Fields, 1973.

[2]  C. S. Wallace, et al. Intrinsic Classification of Spatially Correlated Data, 1998, Comput. J.

[3]  Lloyd Allison, et al. MML Markov classification of sequential data, 1999, Stat. Comput.

[4]  Donald Geman, et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  J. Besag. Spatial Interaction and the Statistical Analysis of Lattice Systems, 1974.

[6]  Gregory J. Chaitin, et al. On the Length of Programs for Computing Finite Binary Sequences, 1966, JACM.

[7]  David L. Dowe, et al. Minimum message length and generalized Bayesian nets with asymmetric languages, 2005.

[8]  C. S. Wallace, et al. An Information Measure for Classification, 1968, Comput. J.

[9]  David L. Dowe, et al. Minimum Message Length and Kolmogorov Complexity, 1999, Comput. J.

[10]  J. Besag. Statistical Analysis of Non-Lattice Data, 1975.

[11]  Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part II, 1964, Inf. Control.

[12]  Peter Grünwald, et al. Invited review of the book Statistical and Inductive Inference by Minimum Message Length, 2006.

[13]  J. Rissanen, et al. Modeling By Shortest Data Description, 1978, Autom.

[14]  David L. Dowe, et al. Bayes not Bust! Why Simplicity is no Problem for Bayesians, 2007, The British Journal for the Philosophy of Science.

[15]  C. S. Wallace, et al. Estimation and Inference by Compact Coding, 1987.

[16]  David L. Dowe, et al. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions, 2000, Stat. Comput.

[17]  Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part I, 1964, Inf. Control.

[18]  C. S. Wallace, et al. Resolving the Neyman-Scott problem by minimum message length, 1997.

[19]  David L. Dowe, et al. Message Length as an Effective Ockham's Razor in Decision Tree Induction, 2001, International Conference on Artificial Intelligence and Statistics.