Alpha-plane based automatic general type-2 fuzzy clustering based on simulated annealing meta-heuristic algorithm for analyzing gene expression data

This paper addresses the clustering of microarray gene expression data using a novel two-stage meta-heuristic algorithm based on the concept of α-planes in general type-2 fuzzy sets. The main aim of this research is to present a powerful data clustering approach capable of dealing with highly uncertain environments. To this end, a new objective function using α-planes for the general type-2 fuzzy c-means clustering algorithm is first presented. Then, following the philosophy of the meta-heuristic optimization framework Simulated Annealing, a two-stage optimization algorithm is proposed. The first stage is devoted to the annealing process together with its proposed perturbation mechanisms. When the first stage terminates, its output is passed to the second stage, where a heuristic algorithm compares it against other possible local optima. The output of the second stage is then fed back into the first stage, and the cycle repeats until no better solution is obtained. The proposed approach has been evaluated on several synthesized datasets and three microarray gene expression datasets. Extensive experiments demonstrate the capabilities of the proposed approach compared with some state-of-the-art techniques in the literature.

The main contributions are: presenting a new two-stage meta-heuristic clustering algorithm based on general type-2 fuzzy sets; incorporating a new similarity-based objective function using the α-plane representation of general type-2 fuzzy sets; and implementing the proposed approach on real microarray gene expression datasets.
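The two-stage control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the cost function is the classical (type-1) fuzzy c-means objective standing in for the α-plane based objective, the number of clusters is fixed rather than determined automatically, the perturbation is a simple Gaussian shift of one center, and the second-stage heuristic is a hypothetical swap test that tries replacing each center with a data point. All function names (`fcm_objective`, `anneal`, `local_check`, `two_stage`) and parameter values are hypothetical.

```python
import math
import random


def fcm_objective(centers, data, m=2.0):
    """Stand-in cost (assumption): classical fuzzy c-means objective on 1-D
    data, evaluated with the optimal memberships for the given centers."""
    total = 0.0
    for x in data:
        d2 = [(x - c) ** 2 for c in centers]
        if min(d2) < 1e-18:
            continue  # point coincides with a center and contributes nothing
        # With optimal memberships the contribution of one point is
        # (sum_j d_j^(-2/(m-1)))^(-(m-1)), where d_j is the distance to center j.
        total += sum(d ** (-1.0 / (m - 1.0)) for d in d2) ** (-(m - 1.0))
    return total


def anneal(centers, data, rng, t0=1.0, cooling=0.9, steps=20, t_min=1e-3):
    """Stage 1: simulated annealing with a Gaussian perturbation (assumption)."""
    current, cur_cost = list(centers), fcm_objective(centers, data)
    best, best_cost = list(current), cur_cost
    t = t0
    while t > t_min:
        for _ in range(steps):
            cand = list(current)
            i = rng.randrange(len(cand))
            cand[i] += rng.gauss(0.0, 1.0) * t  # shift one center
            cost = fcm_objective(cand, data)
            # Always accept improvements; accept worse moves with
            # the Metropolis probability exp(-(cost - cur_cost) / t).
            if cost < cur_cost or rng.random() < math.exp((cur_cost - cost) / t):
                current, cur_cost = cand, cost
                if cost < best_cost:
                    best, best_cost = list(cand), cost
        t *= cooling
    return best, best_cost


def local_check(centers, cost, data):
    """Stage 2 (hypothetical heuristic): try moving each center onto each
    data point and keep any change that lowers the cost."""
    best, best_cost = list(centers), cost
    for i in range(len(centers)):
        for x in data:
            cand = list(best)
            cand[i] = x
            c = fcm_objective(cand, data)
            if c < best_cost:
                best, best_cost = cand, c
    return best, best_cost


def two_stage(data, k, seed=0, max_rounds=10):
    """Alternate stage 1 and stage 2 until a round brings no improvement."""
    rng = random.Random(seed)
    best, best_cost = rng.sample(data, k), math.inf
    for _ in range(max_rounds):
        cand, cost = anneal(best, data, rng)
        cand, cost = local_check(cand, cost, data)
        if cost >= best_cost - 1e-12:
            break  # no better solution found: stop
        best, best_cost = cand, cost
    return sorted(best), best_cost
```

Calling `two_stage([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], k=2)` returns two sorted centers and the final cost. The outer loop mirrors the stopping rule in the abstract: the second-stage output is fed back into the annealing stage, and the procedure ends once a full round yields no lower cost.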
