DEBC-GM: Denclue Based Gaussian Mixture Approach for Big Data Clustering

In today's digitized world, data are growing rapidly in both volume and density. It is therefore necessary to manage the complexity of data efficiently and with little effort. Clustering addresses this complexity by helping the user recognize the natural groupings in a dataset. The objective of data clustering is to find groups of similar objects in a dataset while keeping them separate from noisy points. Various distinct clustering algorithms have been proposed to cope with large datasets. To handle the noisy instances in large datasets, this paper introduces a clustering methodology named DENCLUE-GM, which is built on connectivity and density functions. Preliminary results show that the clustering accuracy rate, clustering quality, and global search efficiency of the improved algorithm are higher than those of classical clustering algorithms. In addition, the DENCLUE-GM methodology is compared with the DBSCAN approach in terms of accuracy, memory utilization, and other quality-based measures.
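To make the notion of a density function concrete, the following is a minimal sketch of a Gaussian kernel density estimate of the kind DENCLUE-style methods build on, together with a simple threshold that separates dense points from noise. It is not the DENCLUE-GM algorithm itself, whose details the abstract does not give. The function names, the bandwidth `h`, the noise threshold `xi`, and the toy data are all assumptions chosen for illustration.

```python
import math

def gaussian_density(x, points, h):
    """Gaussian kernel density estimate at x (h is an assumed bandwidth)."""
    d = len(x)
    # Normalising constant: n * h^d * (2*pi)^(d/2)
    norm = len(points) * (h ** d) * (2 * math.pi) ** (d / 2)
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-sq_dist / (2 * h * h))
    return total / norm

def split_noise(points, h, xi):
    """Split points into dense and noisy sets using an assumed threshold xi."""
    dense, noise = [], []
    for p in points:
        if gaussian_density(p, points, h) >= xi:
            dense.append(p)
        else:
            noise.append(p)
    return dense, noise

# Toy data (hypothetical): four nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
dense, noise = split_noise(points, h=0.5, xi=0.2)
print("dense:", dense)
print("noise:", noise)
```

In this sketch the isolated point falls below the threshold because almost no other point contributes to its density. A full density-based method would go on to group the dense points into clusters, for example by following the density gradient to local maxima.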
