Design and Implementation of an Improved DBSCAN Algorithm

DBSCAN is a density-based clustering algorithm that is widely used in data clustering. When searching for core objects, DBSCAN computes the distance between each object and every other object, which leads to high computational overhead. This paper proposes an improved algorithm that ignores the distant neighbors of an object and computes only the distances between the object and its nearby neighbors. This reduces the number of distance computations needed to find core objects and thereby lowers the computational cost. Four standard data sets from the UCI repository are used to verify the clustering process and the performance of the improved DBSCAN algorithm. Experimental results show that the improved algorithm greatly reduces the computational cost while maintaining high clustering accuracy.
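The abstract does not say how nearby neighbors are selected, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a uniform grid with cell side eps as the pruning structure (a substituted, well-known technique): any point within eps of an object must lie in the object's own cell or an adjacent one, so distances to points in all other cells are never computed. The names grid_dbscan, build_grid, and region_query are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor

NOISE = -1


def build_grid(points, eps):
    """Bucket point indices into grid cells of side eps (assumed pruning structure)."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple(floor(c / eps) for c in p)].append(i)
    return grid


def region_query(points, grid, i, eps, counter):
    """Return indices within eps of point i, checking only adjacent cells."""
    p = points[i]
    cell = tuple(floor(c / eps) for c in p)
    neighbors = []
    for offset in product((-1, 0, 1), repeat=len(p)):
        key = tuple(c + o for c, o in zip(cell, offset))
        for j in grid.get(key, ()):
            counter[0] += 1  # one distance computation
            if dist(p, points[j]) <= eps:
                neighbors.append(j)
    return neighbors


def grid_dbscan(points, eps, min_pts):
    """DBSCAN with grid-pruned neighbor search.

    Returns (labels, number_of_distance_computations).
    """
    grid = build_grid(points, eps)
    labels = [None] * len(points)
    counter = [0]
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, grid, i, eps, counter)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, grid, j, eps, counter)
            if len(j_neighbors) >= min_pts:  # j is a core object
                queue.extend(j_neighbors)
        cluster += 1
    return labels, counter[0]


if __name__ == "__main__":
    pts = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
           (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (10, 10)]
    labels, n_dist = grid_dbscan(pts, eps=0.5, min_pts=3)
    print(labels, n_dist, "vs brute force", len(pts) ** 2)
```

In this sketch the savings depend on how the data are spread relative to eps: when most points fall into a handful of cells, the pruned search approaches the brute-force count, and the number of adjacent cells grows as 3^d with the dimension d, so a grid is a reasonable choice only for low-dimensional data.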
