Two-stage outlier detection algorithm based on clustering division

Constrained by a single global threshold, distance-based outlier detection algorithms can detect only global outliers. This paper proposes a two-stage outlier detection algorithm based on clustering. First, it iterates to obtain the value k required by K-means, using agglomerative hierarchical clustering. Then, it divides the data set into a number of micro-clusters with the K-means method. To improve mining efficiency, it proposes a cluster filtering mechanism based on information entropy that determines whether a micro-cluster contains outliers. Finally, it uses a distance-based approach to detect local outliers within the micro-clusters that contain outliers. Experimental results show that the proposed algorithm has high efficiency, high precision, and low time complexity.
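
The pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several parts are assumptions made only for illustration: k is passed in directly rather than derived by agglomerative hierarchical clustering; the entropy filter (`distance_entropy`, with a hypothetical cutoff `entropy_threshold`) is a stand-in heuristic that measures how evenly the distance to the centroid is shared among a cluster's points; and the distance-based test (`nn_outliers`, with a hypothetical multiplier `factor`) compares each point's nearest-neighbour distance with the median of those distances. All function and parameter names are invented here.

```python
import math
import statistics

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # assignments are stable
            break
        centroids = updated
    return centroids, clusters

def distance_entropy(cluster, centroid):
    """Assumed filter: normalised Shannon entropy of each point's share of the
    total distance to the centroid. A low value means a few points dominate."""
    dists = [math.dist(p, centroid) for p in cluster]
    total = sum(dists)
    if total == 0 or len(cluster) < 2:
        return 1.0
    shares = [d / total for d in dists if d > 0]
    return -sum(s * math.log(s) for s in shares) / math.log(len(cluster))

def nn_outliers(points, factor=3.0):
    """Assumed distance-based test: flag points whose nearest-neighbour distance
    exceeds `factor` times the median nearest-neighbour distance."""
    if len(points) < 3:
        return []
    nn = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
          for i, p in enumerate(points)]
    threshold = factor * statistics.median(nn)
    return [p for p, d in zip(points, nn) if d > threshold]

def two_stage_outliers(points, k, entropy_threshold=0.9, factor=3.0):
    """Stage 1: split into micro-clusters and filter them by entropy.
    Stage 2: run the distance-based test only inside the flagged clusters."""
    centroids, clusters = kmeans(points, k)
    outliers = []
    for centroid, cluster in zip(centroids, clusters):
        if distance_entropy(cluster, centroid) < entropy_threshold:
            outliers.extend(nn_outliers(cluster, factor))
    return outliers

# Toy data: a tight cluster, a loose cluster, and one point that is
# unusual only relative to the tight cluster.
tight = [(x, y) for x in (-0.1, 0.0, 0.1) for y in (-0.1, 0.0, 0.1)]
loose = [(x, y) for x in (18.0, 20.0, 22.0) for y in (18.0, 20.0, 22.0)]
points = tight + [(1.0, 0.0)] + loose

print(nn_outliers(points))            # one global threshold: []
print(two_stage_outliers(points, 2))  # per-cluster threshold: [(1.0, 0.0)]
```

On this toy data the single global threshold is inflated by the loose cluster, so the point at (1.0, 0.0) goes unnoticed. After the split, the entropy filter skips the loose cluster entirely and the local threshold inside the tight cluster exposes the point, which is the behaviour the abstract attributes to the two-stage design.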