Categorical data analysis using multiobjective differential evolution based fuzzy clustering

Over the last decade, the rapid proliferation of categorical data has drawn computer scientists and engineers to its analysis. In this regard, unsupervised techniques such as clustering have been applied, either through new algorithms or through modifications of existing ones. These methods typically optimize a single objective function to obtain the partitions. However, simultaneously optimizing multiple conflicting objectives may yield better clustering results, as multiobjective optimization techniques have been applied successfully in various fields of engineering and science. Hence, this article proposes Multiobjective Differential Evolution based Fuzzy Clustering for Categorical Data. Differential evolution is used as the underlying optimization technique. Moreover, an index encoding scheme is used to encode the vectors in differential evolution, and after the mutation operation, scaling is introduced to bring the encoded index values back within the permissible range of categorical objects. The performance of the proposed method is demonstrated by comparing it with widely used state-of-the-art methods on two synthetic and two real-life data sets. Finally, a statistical test is conducted to assess the superiority of the proposed method.
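
The index encoding and post-mutation scaling described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scaling formula, so several details here are assumptions made for illustration only: each vector is taken to hold K indices of data objects acting as cluster representatives, the mutation is the standard DE/rand/1 rule, and the scaling step is a min-max rescale into [0, n_objects - 1] followed by rounding. The function name `mutate_and_scale` and the parameter `F` are hypothetical and not taken from the article.

```python
import numpy as np

def mutate_and_scale(population, target, n_objects, F=0.8, rng=None):
    """DE/rand/1 mutation on index-encoded vectors, then rescaling.

    population : (P, K) array; each row holds K object indices
                 (assumed encoding: indices of cluster representatives).
    target     : row index of the vector being mutated around.
    n_objects  : number of categorical objects; valid indices are
                 0 .. n_objects - 1.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Pick three distinct donor vectors, all different from the target.
    candidates = [i for i in range(len(population)) if i != target]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)

    # Standard DE/rand/1 mutation; the result is real-valued and may
    # fall outside the valid index range.
    mutant = population[r1] + F * (population[r2] - population[r3])

    lo, hi = mutant.min(), mutant.max()
    if lo < 0 or hi > n_objects - 1:
        if hi > lo:
            # Assumed scaling rule: min-max rescale into [0, n_objects - 1].
            mutant = (mutant - lo) / (hi - lo) * (n_objects - 1)
        else:
            # All components equal: fall back to clipping.
            mutant = np.clip(mutant, 0, n_objects - 1)

    # Round back to integer object indices.
    return np.rint(mutant).astype(int)
```

This sketch requires a population of at least four vectors and does not handle duplicate indices that may appear within one vector after rounding; a full implementation would need a rule for that case.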
