Structural Editing by a Point Density Function

This paper presents a new clustering algorithm for pattern recognition, called structural editing by a point density function (STEP). STEP uses a minimum spanning tree to retain the interpoint structure among the elements of an unclassified training set. The tree is then pruned, or edited, to form clusters using information from a point density function (PDF) estimate. STEP can detect clusters of arbitrary shape even when stray points or outliers lie between clusters, and a cluster need not correspond to a unimodal PDF estimate. Monte Carlo simulations indicate that STEP performs as well as, or better than, a nearest neighbor classifier, which requires a classified training set. A new algorithm for recursively constructing the minimum spanning tree is also presented; it is computationally simpler than conventional algorithms in many practical applications. Results from applying STEP to the mass screening of breast thermograms are discussed.
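
The general idea of building a minimum spanning tree and then editing it with a density estimate can be illustrated with a minimal sketch. It is not the STEP procedure itself: the abstract does not give the pruning rule, the density estimator, or the recursive tree construction. The sketch assumes a standard Prim construction, a crude k-th nearest neighbor density score, and a hypothetical rule that cuts every tree edge touching a point whose score falls below a threshold. The names `prim_mst`, `knn_density`, `prune_and_cluster`, `k`, and `density_floor` are all illustrative.

```python
import math

def prim_mst(points):
    """Return MST edges (i, j, length) via Prim's algorithm, O(n^2)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def knn_density(points, k):
    """Assumed density score: inverse distance to the k-th nearest neighbor."""
    scores = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        scores.append(1.0 / (dists[k] + 1e-12))  # dists[0] is the point itself
    return scores

def prune_and_cluster(points, k=2, density_floor=0.5):
    """Cut MST edges that touch a low-density point (hypothetical rule),
    then label the remaining connected components. Outliers get label -1."""
    n = len(points)
    dens = knn_density(points, k)
    root = list(range(n))      # union-find forest over the kept edges

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i, j, _ in prim_mst(points):
        if dens[i] >= density_floor and dens[j] >= density_floor:
            root[find(i)] = find(j)
    return [find(i) if dens[i] >= density_floor else -1 for i in range(n)]

# Two tight groups with one stray point between them.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5.5, 0.5)]
labels = prune_and_cluster(points, k=2, density_floor=0.5)
```

In this toy data set the stray point links the two groups in the tree, so cutting its edges separates them. A fixed threshold is only one possible editing rule, and the value 0.5 is chosen for this example's scale.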
