Study on initial sensitivity of fuzzy clustering algorithm for big data processing

Fuzzy c-means clustering is an important branch of unsupervised classification in statistical pattern recognition and is widely applied in the power industry. However, the iterative algorithm is sensitive to its initial values: different initial values tend to produce different clustering results, so the clustering analysis often fails to meet the desired requirements. In this study, we find that the traditional clustering algorithm describes the similarity between elements by the reciprocal of the squared vector norm between them. The resulting membership functions have many local minima and are mathematically ill-behaved (pathological), which makes the spatial structure of the clustering space complex. During the iterative search, the algorithm easily falls into a local optimum, which leads to misjudgment of samples. We therefore reconstruct the membership function in exponential form. The reconstructed algorithm reduces the number of extrema in the clustering space and optimizes its structure. The reconstructed algorithm is tested on random points in the two-dimensional plane. The tests show that the same clustering results are always obtained from different starting points. Furthermore, at the same clustering accuracy, the improved algorithm requires fewer iterations than the traditional algorithm and converges faster, which is beneficial for data processing.
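
The contrast between the two membership functions can be illustrated with a minimal Python sketch. The reciprocal form is the standard fuzzy c-means membership. The abstract does not state the exact exponential form, so `memberships_exponential`, its weight exp(-beta * d^2), and the parameter `beta` are assumptions made here for illustration only and are not the authors' formula. All function names, the tolerance, and the test data are likewise hypothetical.

```python
import math
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def memberships_classic(point, centers, m=2.0):
    """Traditional FCM: u_i proportional to (1 / d_i^2) ** (1 / (m - 1))."""
    d2 = [sq_dist(point, c) for c in centers]
    if any(d == 0.0 for d in d2):
        # The reciprocal is undefined on a center; split membership among exact hits.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    w = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(w)
    return [x / total for x in w]

def memberships_exponential(point, centers, beta=1.0):
    """ASSUMED exponential form: u_i proportional to exp(-beta * d_i^2)."""
    d2 = [sq_dist(point, c) for c in centers]
    dmin = min(d2)  # subtracting the minimum keeps the largest weight at 1
    w = [math.exp(-beta * (d - dmin)) for d in d2]
    total = sum(w)
    return [x / total for x in w]

def fuzzy_cluster(points, start, membership, m=2.0, tol=1e-6, max_iter=500):
    """Alternate membership and center updates; return (centers, iterations)."""
    centers = [tuple(c) for c in start]
    dim = len(points[0])
    for iteration in range(1, max_iter + 1):
        u = [membership(p, centers) for p in points]
        updated = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                updated.append(old)  # no point supports this center; keep it
                continue
            updated.append(tuple(
                sum(wk * p[d] for wk, p in zip(w, points)) / total
                for d in range(dim)))
        shift = max(sq_dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol ** 2:
            return centers, iteration
    return centers, max_iter

if __name__ == "__main__":
    # Hypothetical test data: two Gaussian blobs of random 2-D points.
    random.seed(0)
    blobs = [(0.0, 0.0), (5.0, 5.0)]
    points = [(random.gauss(cx, 1.0), random.gauss(cy, 1.0))
              for cx, cy in blobs for _ in range(50)]
    for trial in range(3):
        start = [(random.uniform(-2, 7), random.uniform(-2, 7)) for _ in blobs]
        for name, fn in (("classic", memberships_classic),
                         ("exponential", memberships_exponential)):
            centers, iterations = fuzzy_cluster(points, start, fn)
            print(trial, name, iterations, sorted(centers))
```

Running the script from several random starting points prints the final centers and iteration counts for both variants, which allows an informal check of sensitivity to the starting point. Any difference observed on this toy data depends on the assumed exponential form and on `beta`, and should not be read as a reproduction of the reported results.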