Online Multi-objective Subspace Clustering for Streaming Data

This paper develops an online subspace clustering technique that handles data arriving continuously as a stream. In subspace clustering, each cluster is represented by its own subset of features, and these subsets can differ from cluster to cluster. Most streaming data clustering methods optimize a single objective function, which limits the model to capturing only a particular cluster shape or property. Optimizing multiple objectives simultaneously helps overcome this limitation and yields good-quality clusters. Motivated by this, the proposed streaming subspace clustering method uses an evolutionary technique to simultaneously optimize multiple objective functions that capture cluster compactness and feature relevancy, in order to determine the optimal subspace clusters. The generated clusters are allowed to overlap, so an object may belong to more than one cluster. To assess the benefit of using multiple objectives, the proposed method is evaluated on three real-life and three synthetic data sets. A comparison with several state-of-the-art methods shows the superiority of using multiple objectives in the proposed method.
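To make the idea of trading off several objectives over per-cluster feature subsets concrete, the following is a minimal Python sketch, not the method of this paper. It scores candidate feature subsets for one cluster on two stand-in objectives and keeps the non-dominated (Pareto-optimal) candidates. All names here (`subspace_distance`, `compactness`, `dominates`, `pareto_front`) and both objective definitions are assumptions made for illustration: compactness is taken as the mean per-feature Manhattan distance to the cluster center, and feature relevancy is approximated simply by preferring larger subspaces.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def subspace_distance(x: Point, center: Point, features: Sequence[int]) -> float:
    """Manhattan distance restricted to `features`, averaged over those features."""
    return sum(abs(x[f] - center[f]) for f in features) / len(features)


def compactness(points: Sequence[Point], center: Point, features: Sequence[int]) -> float:
    """Mean subspace distance of the points to the center (lower is better)."""
    return sum(subspace_distance(p, center, features) for p in points) / len(points)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Tuple[float, ...]]) -> List[int]:
    """Indices of the score vectors that no other score vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Toy cluster: tight on features 0 and 1, scattered on feature 2.
points = [(1.0, 2.0, 9.0), (1.2, 2.1, 0.5), (0.8, 1.9, 5.0)]
center = tuple(sum(p[d] for p in points) / len(points) for d in range(3))

# Hypothetical candidate subspaces for this cluster.
candidates = [[0], [0, 1], [0, 1, 2]]

# Objective 1: compactness (minimize).
# Objective 2: negative subspace size (minimize), an assumed stand-in
# for a feature-relevancy objective.
scores = [(compactness(points, center, f), -float(len(f))) for f in candidates]
front = pareto_front(scores)
```

In this toy example the single-feature candidate is dominated by the two-feature candidate, which is both more compact and larger, while the full three-feature subspace survives only because it scores best on the second objective. An evolutionary multi-objective method would maintain and evolve a set of such trade-off solutions rather than enumerate candidates by hand.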
