Research on Clustering Algorithm for Large Data Sets
This paper proposes the DBk-means algorithm to address the clustering problem for large data sets. Hadoop is used to preprocess the large volume of original log data, and the algorithm combines the strengths of the k-means and DBSCAN algorithms. Experimental results show that DBk-means achieves a better clustering effect than k-means alone, and that its accuracy can reach above 83%.
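
The abstract does not state how DBk-means combines the two algorithms, so the following is only an illustrative sketch of one plausible hybrid and not the paper's method. It assumes that DBSCAN is run first to discard noise points and to determine the number of clusters, and that the centroids of the DBSCAN clusters then seed an ordinary k-means pass over the remaining points. The function name `db_kmeans_sketch` and the parameters `eps` and `min_pts` are hypothetical, and the Hadoop preprocessing step is omitted.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def centroid(group):
    """Coordinate-wise mean of a non-empty list of points."""
    dims = len(group[0])
    return tuple(sum(p[d] for p in group) / len(group) for d in range(dims))

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm starting from the given centers."""
    assign = []
    for _ in range(max_iter):
        assign = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return assign, centers

def db_kmeans_sketch(points, eps, min_pts):
    """Assumed hybrid: DBSCAN removes noise and seeds k-means (not the paper's method)."""
    labels = dbscan(points, eps, min_pts)
    k = max(labels, default=-1) + 1  # number of DBSCAN clusters
    if k == 0:
        return [], [], []
    kept = [p for p, l in zip(points, labels) if l != -1]
    seeds = [centroid([p for p, l in zip(points, labels) if l == c]) for c in range(k)]
    assign, centers = kmeans(kept, seeds)
    return kept, assign, centers
```

Under this assumed design, k does not have to be chosen in advance and isolated outliers do not pull the k-means centers, which is one way the two algorithms' strengths could complement each other.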