A Nonparametric Valley-Seeking Technique for Cluster Analysis

The problem of clustering multivariate observations is viewed as the replacement of a set of vectors with a set of labels and representative vectors. A general criterion for clustering is derived as a measure of representation error. Some special cases are derived by simplifying the general criterion. A general algorithm for finding the optimum classification with respect to a given criterion is derived. For a particular case, the algorithm reduces to a repeated application of a straightforward decision rule which behaves as a valley-seeking technique. Asymptotic properties of the procedure are developed. Numerical examples are presented for the finite sample case.
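As an illustration of how a repeated decision rule can behave as a valley-seeking procedure, the following is a minimal sketch, not the algorithm derived in the paper. It assumes the rule takes the form of a local majority vote: each observation is reassigned to the label most common among the observations within a fixed distance of it, and the pass is repeated until no label changes. The function name `valley_seeking`, the parameters `radius` and `max_iter`, the synchronous update, the inclusion of each point in its own neighborhood, and the tie-breaking toward the lowest label are all assumptions made for this example.

```python
import numpy as np

def valley_seeking(points, labels, radius, max_iter=100):
    """Iteratively relabel points by majority vote within a fixed radius.

    Hypothetical sketch: `points` is an (n, d) array, `labels` holds
    integer class labels 0..k-1, and `radius` sets the neighborhood size.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels, dtype=int).copy()
    n_classes = int(labels.max()) + 1

    # Boolean neighborhood matrix; each point counts as its own neighbor
    # (an assumption of this sketch).
    diff = points[:, None, :] - points[None, :, :]
    near = np.sqrt((diff ** 2).sum(axis=-1)) <= radius

    for _ in range(max_iter):
        # votes[i, c] = number of neighbors of point i currently labeled c.
        onehot = np.eye(n_classes, dtype=int)[labels]
        votes = near.astype(int) @ onehot
        # Ties go to the lowest label index (argmax convention).
        new_labels = votes.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no label changed, so the assignment is stable
        labels = new_labels
    return labels

# Two well-separated clumps on a line with a deliberately noisy start.
pts = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]])
init = [0, 1, 0, 0, 1, 1, 0, 1]
print(valley_seeking(pts, init, radius=1.0).tolist())
# → [0, 0, 0, 0, 1, 1, 1, 1]
```

Because no neighborhood spans the empty region between the two clumps, no label can propagate across it, so the class boundary settles in the low-density valley. The number of initial labels bounds the number of clusters from above, and the result depends on the choice of `radius`.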