Extensions of vector quantization for incremental clustering

In this paper, we extend conventional vector quantization by incorporating a vigilance parameter, which steers the tradeoff between plasticity and stability during incremental online learning. This extension is motivated by the adaptive resonance theory (ART) network approach and is exploited here to form a one-pass, incremental and evolving variant of vector quantization. This variant can be applied to online clustering, classification and approximation tasks with an unknown number of clusters. Additionally, two novel extensions are described. The first incorporates the sphere of influence of clusters into the vector quantization learning process by selecting the 'winning cluster' based on the distances of a data point to the surfaces of all clusters. The second introduces a deletion of cluster satellites and an online split-and-merge strategy: clusters are dynamically split and merged after each incremental learning step. Both strategies prevent the algorithm from generating a wrong cluster partition due to a bad a priori setting of the most essential parameter(s). The extensions are applied to the clustering of two- and high-dimensional data, within an image classification framework, and to model-based fault detection based on data-driven evolving fuzzy models.
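The vigilance idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `evolving_vq`, the Euclidean distance to cluster centers, and the decreasing learning gain `learning_gain / count` are all assumptions made for illustration, and the surface-distance winner selection, satellite deletion and split-and-merge steps are left out. Each sample either moves the nearest center toward itself or, if it lies farther away than the vigilance threshold, opens a new cluster.

```python
import math


def evolving_vq(points, vigilance, learning_gain=0.5):
    """One-pass incremental VQ sketch (hypothetical helper, not the paper's code).

    points:        iterable of equal-length numeric sequences
    vigilance:     distance threshold above which a new cluster is created
    learning_gain: assumed initial gain; the effective gain decays as
                   learning_gain / (number of samples won by the cluster)
    Returns (centers, counts).
    """
    centers = []  # one list of coordinates per cluster
    counts = []   # number of samples assigned to each cluster
    for x in points:
        if centers:
            # Winner = nearest center in Euclidean distance (an assumption).
            dists = [math.dist(x, c) for c in centers]
            win = min(range(len(centers)), key=dists.__getitem__)
        if not centers or dists[win] > vigilance:
            # Plasticity: the sample matches no cluster well enough.
            centers.append([float(v) for v in x])
            counts.append(1)
        else:
            # Stability: move only the winning center toward the sample.
            counts[win] += 1
            eta = learning_gain / counts[win]
            centers[win] = [c + eta * (v - c) for c, v in zip(centers[win], x)]
    return centers, counts


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
    centers, counts = evolving_vq(data, vigilance=1.0)
    print(len(centers), counts)
```

In this sketch a small vigilance value yields many small clusters and a large one yields few broad clusters, which is the sensitivity to a priori parameter settings that the split-and-merge strategy is meant to reduce.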
