Detecting and identifying bad data in a power system plays an important role in helping dispatchers grasp the operating status of the grid in real time. In the traditional GSA bad data identification algorithm, the initial clustering values are selected at random, which degrades both identification precision and computation speed. To overcome this, this paper proposes an optimized GSA algorithm based on an area density statistics method. The algorithm computes the area density of each object to be clustered and selects, as the initial cluster centers, the k points that lie in the highest-density areas and are farthest from each other. Experimental results show that the optimized GSA algorithm improves the accuracy of the clustering dispersion measure and the accuracy of bad data identification. It also greatly reduces the computational cost of the iterations, which shortens computing time. For large systems with large volumes of data, the method is fast and efficient and has good application potential.
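The initial-center selection described above can be illustrated with a minimal sketch. The abstract does not define "area density" or say how density and distance are combined, so the following are assumptions made only for illustration: density is taken as the number of other points within a fixed `radius`, the densest `top_fraction` of points form the candidate set, and centers are then picked greedily so that each new center is as far as possible from those already chosen. The names `area_density`, `select_initial_centers`, `radius`, and `top_fraction` are hypothetical and do not come from the paper.

```python
import math


def area_density(points, radius):
    """Assumed density measure: for each point, count the other points within `radius`."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def select_initial_centers(points, k, radius, top_fraction=0.5):
    """Pick k initial centers that are dense and mutually far apart (illustrative sketch)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    density = area_density(points, radius)

    # Candidate set: the densest points only, so isolated outliers are excluded.
    # `top_fraction` is a hypothetical tuning parameter, not taken from the paper.
    n_candidates = max(k, math.ceil(top_fraction * len(points)))
    order = sorted(range(len(points)), key=lambda i: -density[i])
    candidates = order[:n_candidates]

    # Start from the densest point, then repeatedly add the candidate whose
    # distance to its nearest already-chosen center is largest.
    chosen = [candidates[0]]
    while len(chosen) < k:
        best = max(
            (i for i in candidates if i not in chosen),
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(best)

    return [points[i] for i in chosen]
```

The returned points would then seed the clustering step in place of randomly chosen initial values. Restricting the choice to dense candidates is what keeps a single far-away bad measurement from being selected as a center purely because of its distance.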