Rough Graded Possibilistic Meta: Outlier Detection in Granular Clustering

Cluster analysis and outlier detection are strongly coupled tasks in data mining. A few points that belong to no cluster can easily corrupt an otherwise well-defined clustering structure. The same problem arises in meta-clustering, where different clusterings of the same data are themselves clustered, which reduces the number of alternatives to compare and simplifies the choice of the best partitioning. In this paper, the outlier rejection problem is tackled with a rough graded possibilistic medoid meta-clustering algorithm, exploiting its ability to perform a soft transition from probabilistic to possibilistic memberships and its natural rejection of anomalous observations. Outlier detection is hence based on a threshold: a partition whose membership is low in all meta-clusters is treated as an anomalous observation and filtered out of the clustering process. The effectiveness of the proposed approach has been assessed by comparing the performance of the meta-clustering algorithm with and without outlier detection on synthetic data, yielding promising results.
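
The thresholding rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_outliers`, the membership matrix `U`, and the threshold value 0.2 are all hypothetical, and the memberships are assumed to have been computed already by some possibilistic meta-clustering step. Each row holds the memberships of one base partition in every meta-cluster. Because possibilistic memberships need not sum to one across meta-clusters, a row can be low everywhere, which is what the rule looks for.

```python
from typing import List, Sequence


def flag_outliers(memberships: Sequence[Sequence[float]],
                  threshold: float) -> List[int]:
    """Return indices of rows whose membership is below `threshold`
    in every meta-cluster (assumes each row is non-empty)."""
    outliers = []
    for i, row in enumerate(memberships):
        # "Low in all meta-clusters" is equivalent to the largest
        # membership of the row being below the threshold.
        if max(row) < threshold:
            outliers.append(i)
    return outliers


# Hypothetical memberships of 4 base partitions in 2 meta-clusters.
U = [
    [0.90, 0.05],
    [0.10, 0.85],
    [0.08, 0.04],  # low in both meta-clusters: candidate outlier
    [0.55, 0.40],
]

outlier_idx = flag_outliers(U, threshold=0.2)
# Partitions kept for the subsequent meta-clustering run.
kept = [i for i in range(len(U)) if i not in outlier_idx]
```

In this toy input only the third partition is flagged. The threshold is a free parameter in the sketch; how it should be chosen is not specified here.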
