An incremental network for on-line unsupervised classification and topology learning

This paper presents an on-line unsupervised learning mechanism for unlabeled data that are polluted by noise. Using a similarity threshold-based insertion criterion and a local error-based insertion criterion, the system can grow incrementally and accommodate input patterns drawn from a non-stationary on-line data distribution. The definition of a utility parameter, the error-radius, allows the system to learn the number of nodes needed to solve a task. A new technique for removing nodes in regions of low probability density separates clusters with low-density overlaps and dynamically eliminates noise in the input data. The two-layer neural network design enables the system to represent the topological structure of unsupervised on-line data, report a reasonable number of clusters, and give typical prototype patterns for every cluster, without prior conditions such as a suitable number of nodes or a good initial codebook.
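To make the idea of similarity threshold-based insertion and low-density node removal more concrete, here is a minimal Python sketch. It is an illustration under strong assumptions, not the paper's algorithm: the fixed global `threshold`, the learning rate `lr`, the win counter used as a density proxy, and the function names `learn_online` and `prune` are all hypothetical, and the error-radius, the local error-based insertion, the topology edges, and the second layer are omitted.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def learn_online(stream, threshold, lr=0.1):
    """Process inputs one at a time (assumed simplification).

    A new node is inserted when the input is farther than `threshold`
    from its nearest node; otherwise the nearest node is moved toward
    the input and its win counter is incremented.
    """
    nodes = []  # each node: {"w": weight vector, "wins": times it was nearest}
    for x in stream:
        if not nodes:
            nodes.append({"w": list(x), "wins": 1})
            continue
        winner = min(nodes, key=lambda n: dist(n["w"], x))
        if dist(winner["w"], x) > threshold:
            # Input is dissimilar to every existing node: insert a new one.
            nodes.append({"w": list(x), "wins": 1})
        else:
            # Input is similar: adapt the winner toward it.
            winner["w"] = [w + lr * (xi - w) for w, xi in zip(winner["w"], x)]
            winner["wins"] += 1
    return nodes


def prune(nodes, min_wins):
    """Drop nodes that rarely won, a crude stand-in for low-density removal."""
    return [n for n in nodes if n["wins"] >= min_wins]


if __name__ == "__main__":
    stream = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (9, 0)]
    nodes = learn_online(stream, threshold=1.0)
    kept = prune(nodes, min_wins=2)
    print(len(nodes), len(kept))
```

In this toy stream the isolated point `(9, 0)` creates a node that never wins again, so pruning removes it while the two denser groups keep their prototypes. A fixed threshold is the main simplification here; choosing it by hand is exactly the kind of prior condition the abstract says the proposed system avoids.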
