Use of multiobjective differential fuzzy clustering with ANN classifier for unsupervised pattern classification: Application to microarray analysis

Microarray technology has made it possible to monitor the expression levels of many genes simultaneously across a number of experimental conditions. Fuzzy clustering is an important tool for analyzing microarray gene expression data. This motivated us to present a new multiobjective evolutionary algorithm for fuzzy clustering that uses differential evolution and an artificial neural network (ANN) classifier. The proposed approach has two phases. In the first phase, it optimizes multiple cluster validity measures simultaneously to obtain a set of near-Pareto-optimal solutions containing a number of nondominated solutions. In the second phase, for each nondominated solution, a fraction of the data points is selected from the different clusters based on their proximity to the respective cluster centres; a fuzzy voting technique is then used to generate the training data set for the ANN classifier, which classifies the remaining data points. The hybrid approach is tested on two publicly available benchmark microarray data sets. Its results are compared with those of a multiobjective genetic fuzzy clustering algorithm based on NSGA-II, a state-of-the-art multiobjective evolutionary algorithm (MOEA). A biological significance test is also carried out using a web-based gene annotation tool to show that the proposed method produces biologically relevant clusters of co-expressed genes.
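The second phase can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the voting rule, so the code assumes that fuzzy voting means averaging the membership matrices of the nondominated solutions, that cluster labels are already aligned across solutions, and that "proximity to the centre" can be approximated by high membership. The function name `fuzzy_vote_training_set` and the parameter `fraction` are hypothetical.

```python
import numpy as np

def fuzzy_vote_training_set(memberships, fraction=0.5):
    """Pick training points for a downstream classifier (illustrative only).

    memberships: list of (n_points, n_clusters) fuzzy membership arrays,
                 one per nondominated solution; cluster labels are ASSUMED
                 to be aligned across solutions.
    fraction:    share of each cluster's points kept as training data.
    Returns a boolean training mask and the voted cluster label per point.
    """
    stacked = np.stack(memberships)      # (n_solutions, n_points, n_clusters)
    mean_u = stacked.mean(axis=0)        # assumed voting rule: average memberships
    labels = mean_u.argmax(axis=1)       # voted cluster for every point
    confidence = mean_u.max(axis=1)      # strength of the winning vote

    train_mask = np.zeros(len(labels), dtype=bool)
    for k in np.unique(labels):
        members = np.flatnonzero(labels == k)
        n_keep = max(1, int(np.ceil(fraction * len(members))))
        # keep the points with the highest averaged membership in cluster k
        order = np.argsort(confidence[members])[::-1]
        train_mask[members[order[:n_keep]]] = True
    return train_mask, labels
```

Points where the mask is true, together with their voted labels, would form the training set; the remaining points would then be passed to the trained classifier (an ANN in the paper) for assignment.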
