A density-based competitive data stream clustering network with self-adaptive distance metric

Data stream clustering is a branch of clustering in which patterns are processed as an ordered sequence. In this paper, we propose an unsupervised learning neural network, the Density Based Self Organizing Incremental Neural Network (DenSOINN), for data stream clustering tasks. DenSOINN is a self-organizing competitive network that grows incrementally, learning nodes that fit the distribution of the input data. It combines online unsupervised learning with topology learning by means of the competitive Hebbian learning rule. By adopting a density-based clustering mechanism, DenSOINN discovers arbitrarily shaped clusters and reduces the negative effect of noise. In addition, we adopt a self-adaptive distance framework so that the network performs well on unnormalized input data. Experiments show that DenSOINN performs competitively with state-of-the-art methods.
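To make the idea of an incrementally growing competitive network more concrete, the following is a minimal sketch of topology learning with a competitive Hebbian rule. It is not the DenSOINN algorithm: the class name `CompetitiveNetwork`, the fixed `threshold`, the `learning_rate`, and the win counter are all assumptions made for illustration, and the density-based mechanism and self-adaptive distance framework described above are not implemented here.

```python
import math


class CompetitiveNetwork:
    """Toy incremental competitive network (illustrative only, not DenSOINN)."""

    def __init__(self, threshold, learning_rate=0.1):
        self.threshold = threshold          # assumed fixed insertion threshold
        self.learning_rate = learning_rate  # assumed constant step size
        self.nodes = []                     # weight vectors of the nodes
        self.wins = []                      # win counts, a crude density proxy
        self.edges = set()                  # undirected edges as frozensets

    def _insert(self, x):
        self.nodes.append(list(x))
        self.wins.append(1)

    def learn(self, x):
        """Process one pattern from the stream."""
        x = list(x)
        if len(self.nodes) < 2:
            self._insert(x)
            return
        # Rank nodes by Euclidean distance to find the winner and runner-up.
        order = sorted(range(len(self.nodes)),
                       key=lambda i: math.dist(x, self.nodes[i]))
        first, second = order[0], order[1]
        if math.dist(x, self.nodes[first]) > self.threshold:
            # The pattern is far from every node: grow the network.
            self._insert(x)
            return
        # Competitive Hebbian rule (simplified): link winner and runner-up
        # only when the runner-up is also close to the pattern.
        if math.dist(x, self.nodes[second]) <= self.threshold:
            self.edges.add(frozenset((first, second)))
        # Move the winner a small step toward the pattern.
        self.wins[first] += 1
        w = self.nodes[first]
        self.nodes[first] = [wi + self.learning_rate * (xi - wi)
                             for wi, xi in zip(w, x)]

    def clusters(self):
        """Return clusters as connected components of the node graph."""
        adjacency = {i: set() for i in range(len(self.nodes))}
        for edge in self.edges:
            a, b = tuple(edge)
            adjacency[a].add(b)
            adjacency[b].add(a)
        seen, components = set(), []
        for start in adjacency:
            if start in seen:
                continue
            stack, component = [start], set()
            while stack:
                node = stack.pop()
                if node in component:
                    continue
                component.add(node)
                stack.extend(adjacency[node] - component)
            seen |= component
            components.append(component)
        return components


# Usage: feed a short stream drawn from two well-separated groups.
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
          (5.1, 5.0), (0.05, 0.05), (5.05, 5.05)]
net = CompetitiveNetwork(threshold=1.0)
for point in stream:
    net.learn(point)
print(len(net.nodes), "nodes,", len(net.clusters()), "clusters")
```

Because each pattern is seen once and only the node set, win counts, and edges are kept, the sketch fits the streaming setting. A fixed Euclidean threshold is sensitive to feature scale, which is the kind of limitation a self-adaptive distance is meant to address.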
