108. Application of Parallel Clustering Algorithm Based on Domain Decomposition

In recent years, with the rapid development of computer technology and the spread of the Internet, data volumes have grown geometrically, which indicates that China has entered the era of big data. This growth has also brought many technical problems, and how to mine and analyze massive data sets has become an urgent problem. Cloud computing is now an active research topic, and Hadoop, an open-source platform for cloud computing, offers strong advantages in the parallel processing of massive data. However, traditional clustering algorithms on the Hadoop platform still allocate data to nodes in equal shares, which seriously degrades MapReduce performance. To address this, this paper proposes a parallel clustering algorithm based on domain decomposition, namely a parallel DBSCAN clustering algorithm. The data distribution method and the parallelization of the clustering algorithm are studied in depth, and on that basis the application of the domain-decomposition-based DBSCAN clustering algorithm is examined.
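As a rough illustration of what domain decomposition can mean for DBSCAN, the sketch below splits 2-D points into vertical strips by coordinate instead of dealing them out in equal shares. Points within the DBSCAN radius of a strip boundary are copied into the neighboring strip, so each strip could be clustered independently and the results merged afterwards. This is not the paper's actual distribution method: the strip layout and the names `decompose_with_halo`, `cell_width`, and `eps` are assumptions made for the example, and it assumes `eps` is smaller than `cell_width`.

```python
from collections import defaultdict


def decompose_with_halo(points, cell_width, eps):
    """Assign each 2-D point to a vertical strip (hypothetical layout).

    Each entry is (x, y, owned). A point is "owned" by the strip that
    contains it and copied as a halo point (owned=False) into a neighboring
    strip when it lies within eps of the shared boundary.
    Assumes eps < cell_width, so a point reaches at most one strip per side.
    """
    parts = defaultdict(list)
    for x, y in points:
        home = int(x // cell_width)          # index of the strip containing x
        parts[home].append((x, y, True))
        if x - home * cell_width <= eps:     # close to the left boundary
            parts[home - 1].append((x, y, False))
        if (home + 1) * cell_width - x <= eps:  # close to the right boundary
            parts[home + 1].append((x, y, False))
    return dict(parts)


points = [(0.5, 0.0), (0.95, 0.0), (1.05, 0.0), (1.5, 0.0)]
parts = decompose_with_halo(points, cell_width=1.0, eps=0.1)
for strip in sorted(parts):
    print(strip, parts[strip])
```

In a MapReduce setting, the strip index would plausibly serve as the map output key, with each reducer clustering one strip. Merging clusters that share halo points is left out of this sketch.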