Finding natural clusters having minimum description length

A two-step procedure for finding natural clusters in geometric point data is described. The first step computes a hierarchical cluster tree that minimizes an entropy objective function, using a new procedure called numerical iterative hierarchical clustering (NIHC). The second step recursively searches this tree for a level clustering having minimum description length (MDL). Together, the two steps find natural clusters without requiring the user to specify threshold parameters or other so-called magic numbers; in particular, the method automatically determines the number of clusters in the input data. The MDL formulation, which is equivalent to maximizing the posterior probability, is well suited to the clustering problem because it defines a natural prior distribution.
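
The second step, a recursive search of the cluster tree for a minimum-description-length level clustering, can be illustrated with a minimal sketch. It is not the procedure from the paper: it assumes that a level clustering is a cut through the tree (each data point falls in exactly one selected node) and that the description length of a clustering is the sum of per-cluster costs. The names `Node`, `best_level_clustering`, and `toy_cost` are hypothetical, and `toy_cost` (a fixed per-cluster charge plus squared deviation from the cluster mean) is only a stand-in for a real description-length measure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One node of a cluster tree (hypothetical structure, 1-D points for brevity)."""
    points: List[float]
    children: List["Node"] = field(default_factory=list)


def toy_cost(points: List[float]) -> float:
    """Placeholder cost: fixed charge per cluster plus squared deviation from the mean.

    This is an assumption for illustration, not the paper's MDL formula.
    """
    mean = sum(points) / len(points)
    return 2.0 + sum((x - mean) ** 2 for x in points)


def best_level_clustering(
    node: Node, cost: Callable[[List[float]], float]
) -> Tuple[float, List[List[float]]]:
    """Return (total cost, clusters) for the cheapest cut through this subtree."""
    keep_cost = cost(node.points)
    if not node.children:
        return keep_cost, [node.points]

    # Cost of replacing this node by the best cuts of its subtrees.
    split_cost = 0.0
    split_clusters: List[List[float]] = []
    for child in node.children:
        child_cost, child_clusters = best_level_clustering(child, cost)
        split_cost += child_cost
        split_clusters.extend(child_clusters)

    # Keep the node as a single cluster unless splitting is strictly cheaper.
    if split_cost < keep_cost:
        return split_cost, split_clusters
    return keep_cost, [node.points]


# Toy tree: two tight groups far apart, each with single-point leaves.
left = Node([0.0, 0.1], [Node([0.0]), Node([0.1])])
right = Node([10.0, 10.1], [Node([10.0]), Node([10.1])])
root = Node([0.0, 0.1, 10.0, 10.1], [left, right])

total, clusters = best_level_clustering(root, toy_cost)
print(len(clusters), clusters)
```

Because the assumed cost is additive over clusters, the best cut of a subtree depends only on that subtree, so one bottom-up pass suffices and the number of clusters falls out of the search rather than being supplied by the user.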