A test for spatial homogeneity in cluster analysis

This paper proposes a measure of spatial homogeneity for sets of d-dimensional points based on nearest neighbor distances. It examines tests for spatial uniformity that assess the tendency of the entire data set to aggregate and that evaluate the character of individual clusters. The sizes and powers of three statistical tests of uniformity against aggregation, regularity, and unimodality are studied to determine robustness, as are the effects of normalization and incorrect prior information. A “percentile frame” sampling procedure is proposed that does not require a sampling window but is superior to toroidal frame sampling and to buffer zone sampling in particular situations. The examples test two data sets for homogeneity and search the results of a hierarchical clustering for homogeneous clusters.
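
To make the idea of a nearest-neighbor test of uniformity concrete, the sketch below runs a generic Monte Carlo test against aggregation. It is not the statistic or the sampling frame proposed in the paper: the mean nearest-neighbor distance as test statistic, the unit hypercube as sampling window, and the names `mean_nn_distance` and `monte_carlo_uniformity_test` are all assumptions made for illustration.

```python
import numpy as np


def mean_nn_distance(points):
    """Mean Euclidean distance from each point to its nearest neighbor."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return dist.min(axis=1).mean()


def monte_carlo_uniformity_test(points, n_sim=199, seed=0):
    """One-sided Monte Carlo test of uniformity against aggregation.

    Assumption: under the null hypothesis the points are uniform on the
    unit hypercube [0, 1]^d. Aggregated data tend to have small
    nearest-neighbor distances, so small observed values count as evidence
    against uniformity.
    """
    rng = np.random.default_rng(seed)
    n, d = points.shape
    observed = mean_nn_distance(points)
    simulated = np.array(
        [mean_nn_distance(rng.random((n, d))) for _ in range(n_sim)]
    )
    # Rank of the observed statistic among the simulated null values.
    rank = int(np.sum(simulated <= observed))
    p_value = (rank + 1) / (n_sim + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical data: one tight cluster near the center of the unit square.
    clustered = 0.5 + 0.01 * rng.standard_normal((50, 2))
    statistic, p = monte_carlo_uniformity_test(clustered)
    print(f"mean NN distance = {statistic:.4f}, p-value = {p:.3f}")
```

A small p-value suggests aggregation relative to the uniform null. Reversing the inequality in the rank computation would give the corresponding one-sided test against regularity. The brute-force distance matrix needs memory quadratic in the number of points, so this sketch suits only small data sets.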
