Scale-based clustering using the radial basis function network

This paper shows how scale-based clustering can be performed with the radial basis function network (RBFN), using the RBF width as the scale parameter and a dummy target as the desired output. The technique suggests the "right" scale at which a given data set should be clustered, thereby providing a solution to the problem of determining the number of RBF units and the widths required to get a good network solution. The network compares favorably with other standard techniques on benchmark clustering examples. The properties that non-Gaussian basis functions must have to serve in alternative clustering networks are identified. Overall, this work points out the important role played by the width parameter in the RBFN when observed over several scales, and provides a fundamental link to the scale-space theory developed in computational vision.
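
The idea of treating the Gaussian width as a scale parameter can be illustrated with a minimal sketch. The abstract does not give the network's training rule, so the code below swaps in a Gaussian-weighted fixed-point update (a mean-shift-style iteration), which is an assumption and not necessarily the paper's method. The function names `cluster_at_scale` and `count_clusters_over_scales`, the iteration count, and the merge tolerance are all hypothetical choices made for illustration. One Gaussian unit is started at every data point, each unit is moved to a local mode at the given width, and the distinct converged centres are counted.

```python
import numpy as np


def cluster_at_scale(data, sigma, n_iter=200, merge_tol=1e-2):
    """Return the distinct centres found at Gaussian width `sigma`.

    Assumption: a mean-shift-style update stands in for the network's
    learning rule, which the abstract does not specify.
    """
    data = np.asarray(data, dtype=float)
    centers = data.copy()  # one Gaussian unit per data point
    for _ in range(n_iter):
        # Squared distance from every centre to every data point.
        d2 = ((centers[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
        # Gaussian response of each unit to each data point.
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        # Move each centre to the response-weighted mean of the data.
        centers = (w @ data) / w.sum(axis=1, keepdims=True)

    # Merge centres that converged to (numerically) the same location.
    distinct = []
    for c in centers:
        if not any(np.linalg.norm(c - d) < merge_tol for d in distinct):
            distinct.append(c)
    return np.array(distinct)


def count_clusters_over_scales(data, sigmas):
    """Number of distinct centres at each width in `sigmas`."""
    return [len(cluster_at_scale(data, s)) for s in sigmas]


# Hypothetical toy data: two tight, well-separated blobs.
rng = np.random.default_rng(0)
toy = np.vstack([
    rng.normal(loc=0.0, scale=0.2, size=(30, 2)),
    rng.normal(loc=10.0, scale=0.2, size=(30, 2)),
])
print(count_clusters_over_scales(toy, [1.0, 3.0, 20.0]))
```

Sweeping the width from small to large values and watching the centre count change is what makes the procedure scale-based. Under this sketch's assumption, a count that stays the same over a wide range of widths would be a natural candidate for the scale at which to cluster the data.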
