A Refined Rough K-Means Clustering Algorithm based on Minimizing the Effect of Local Outlier Objects to Improve Overlapping Detection

Rough K-Means (RKM) was proposed as the first rough clustering algorithm to improve the quality of overlapping detection. A recent variant, π RKM, has been found to be the most powerful and effective version: compared with earlier RKM versions, it increases the number of objects that are correctly clustered and decreases the number of objects that are incorrectly clustered. However, clustering with RKM remains challenging because it is difficult to establish a standard measure for reducing the effect of local outlier objects on the means function. This study refines the RKM algorithm to address that problem and makes two contributions. First, we use the Local Outlier Factor (LOF) technique to identify objects as outliers. Second, we apply a weight that reduces the effect of these local outliers on the means function. Experiments on synthetic and real-life datasets show that the refined algorithm improves the quality of overlapping detection compared with recent versions.
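The two components can be illustrated with a minimal Python sketch. It is not the algorithm from this study: the LOF computation follows the standard definition from Breunig et al. (k-distance, reachability distance, local reachability density), simplified to exactly k neighbours per object, and the weight `1 / max(1, LOF)` is an assumption chosen only to show how an outlier's pull on a cluster mean could be damped. The function names `lof_scores`, `outlier_weight`, and `weighted_mean` are hypothetical.

```python
import math

def euclid(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lof_scores(points, k):
    """Local Outlier Factor of every point, using exactly k nearest neighbours.

    Simplification: ties at the k-distance are cut off rather than included,
    and the data are assumed to contain no more than k identical points.
    """
    n = len(points)
    # For each point: its k nearest neighbours as (distance, index) pairs.
    neigh = []
    for i in range(n):
        dists = sorted((euclid(points[i], points[j]), j) for j in range(n) if j != i)
        neigh.append(dists[:k])
    # k-distance: distance to the k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def outlier_weight(lof):
    """Assumed weighting: full weight for LOF <= 1, shrinking as LOF grows."""
    return 1.0 / max(1.0, lof)

def weighted_mean(points, weights):
    """Weighted centroid of the points."""
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]

# Four tightly grouped objects and one local outlier in the same cluster.
cluster = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(cluster, k=2)
weights = [outlier_weight(s) for s in scores]
plain = weighted_mean(cluster, [1.0] * len(cluster))
damped = weighted_mean(cluster, weights)
print("LOF scores:", [round(s, 2) for s in scores])
print("plain mean:", plain)
print("LOF-weighted mean:", [round(c, 3) for c in damped])
```

In this toy cluster the outlier at (5, 5) receives a LOF well above 1, so its weight is small and the weighted mean stays much closer to the dense group than the plain mean does. In a rough k-means setting the same weighting would presumably be applied when computing the means over the lower approximation and the boundary region, but that integration is not shown here.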
