Outlier Detection using Improved Genetic K-means

In this article, we present an algorithm that performs outlier detection and data clustering simultaneously. The algorithm improves the estimation of the centroids of the generative distribution during clustering and outlier discovery. It consists of two stages: the first runs an improved genetic k-means (IGK) algorithm, and the second iteratively removes the vectors that are far from their cluster centroids.
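The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the IGK stage, so plain Lloyd k-means stands in for it here, and the removal rule (flag a vector when its distance to the nearest centroid exceeds `factor` times the median distance) is an assumption made for illustration. The names `kmeans`, `remove_outliers`, `factor`, and `max_rounds` are hypothetical.

```python
import math
import statistics


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centroids, iters=20):
    """Plain Lloyd k-means (assumed stand-in for the IGK stage)."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)[0]].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids


def remove_outliers(points, init_centroids, factor=3.0, max_rounds=10):
    """Alternate clustering with removal of vectors far from their centroid.

    Assumed rule: a vector is an outlier when its distance to the nearest
    centroid exceeds factor * median distance. Stops when nothing is flagged.
    """
    kept, outliers = list(points), []
    centroids = list(init_centroids)
    for _ in range(max_rounds):
        centroids = kmeans(kept, centroids)
        dists = [nearest(p, centroids)[1] for p in kept]
        cutoff = factor * statistics.median(dists)
        if cutoff == 0:
            break
        flagged = [p for p, d in zip(kept, dists) if d > cutoff]
        if not flagged:
            break
        outliers.extend(flagged)
        kept = [p for p, d in zip(kept, dists) if d <= cutoff]
    return centroids, kept, outliers


# Two tight groups plus one distant vector.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (30, 30)]
centroids, kept, outliers = remove_outliers(data, [(0, 0), (10, 10)])
print(centroids, outliers)
```

In this toy run the distant vector first pulls the second centroid away from its group; once it is removed, the next clustering pass re-estimates that centroid from the remaining vectors, which is the effect the abstract describes.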
