Outlier Detection Using Rough Sets

Clustering-based methods are preferred for outlier detection in many contemporary applications because a wide range of data clustering methods is available. However, the uncertainty regarding the cluster membership of an outlier object must be handled appropriately during the clustering process. To address this issue, this chapter examines soft computing methodologies based on rough sets for clustering data that contain outliers. Specifically, it looks in detail at data comprising categorical attributes, where outlier detection is carried out through clustering that employs rough sets. Experimental observations on benchmark data sets indicate that these soft computing techniques produce promising outlier detection results compared with their counterparts.
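To make the idea of uncertain cluster membership concrete, the following is a minimal sketch of a rough assignment step for categorical data. It is not the chapter's algorithm: it assumes a k-modes-style setting with fixed cluster modes, a simple Hamming mismatch distance, and a hypothetical difference threshold `epsilon` that decides when an object is too ambiguous to place in a single cluster. The function names (`hamming`, `rough_assign`, `boundary_objects`) are illustrative only. Objects that land only in upper approximations form the boundary region, which a rough clustering approach might treat as outlier candidates.

```python
from typing import List, Sequence, Set, Tuple

Obj = Sequence[str]


def hamming(a: Obj, b: Obj) -> int:
    """Number of categorical attributes on which two objects differ."""
    return sum(x != y for x, y in zip(a, b))


def rough_assign(
    objects: List[Obj], modes: List[Obj], epsilon: int = 1
) -> Tuple[List[Set[int]], List[Set[int]]]:
    """Assign object indices to lower and upper approximations of clusters.

    Assumption: an object is 'close' to every cluster whose distance is
    within `epsilon` of its nearest cluster. If exactly one cluster is
    close, the object also joins that cluster's lower approximation.
    """
    k = len(modes)
    lower: List[Set[int]] = [set() for _ in range(k)]
    upper: List[Set[int]] = [set() for _ in range(k)]
    for i, obj in enumerate(objects):
        dists = [hamming(obj, m) for m in modes]
        d_min = min(dists)
        close = [j for j, d in enumerate(dists) if d - d_min <= epsilon]
        for j in close:
            upper[j].add(i)          # possible member of cluster j
        if len(close) == 1:
            lower[close[0]].add(i)   # certain member of a single cluster
    return lower, upper


def boundary_objects(
    lower: List[Set[int]], upper: List[Set[int]]
) -> Set[int]:
    """Indices in some upper approximation but in no lower approximation."""
    boundary: Set[int] = set()
    for lo, up in zip(lower, upper):
        boundary |= up - lo
    return boundary
```

In a full rough clustering loop, the modes would be recomputed from the lower and upper approximations and the assignment repeated until it stabilizes; that update step, and how boundary objects are finally scored as outliers, are left out of this sketch.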
