Ensemble-based multi-objective clustering algorithms for gene expression data sets

This paper proposes two multi-objective clustering ensemble algorithms, MOCLED and MOCNCD. MOCLED differs from MOCLE in three respects. First, different clustering algorithms are used to produce some of the new individuals during the evolutionary process. Second, a new screening mechanism is added: in each generation, the worst individual is replaced by the best individual. Third, a new objective function is added to ensure a diverse population. MOCNCD is identical to MOCLED except for the crossover operator, which is replaced by a newly proposed cluster ensemble algorithm, IDICLENS. Experimental results reveal the advantages of the proposed methods in finding good partitions.
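The screening step described above can be illustrated with a minimal sketch. The abstract does not say how "best" and "worst" are determined in the multi-objective setting, so this sketch assumes a hypothetical scalar `score` function supplied by the caller; the names `screen_population`, `Partition`, and `num_clusters` are illustrative and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# A partition is represented as one cluster label per sample (assumption).
Partition = Sequence[int]


def screen_population(
    population: List[Partition],
    score: Callable[[Partition], float],
) -> List[Partition]:
    """Return a new population in which the lowest-scoring individual
    is replaced by a copy of the highest-scoring one.

    `score` is a hypothetical scalar quality measure (higher is better);
    the paper's actual ranking criterion is not given in the abstract.
    """
    if len(population) < 2:
        return list(population)

    scores = [score(p) for p in population]
    best = max(range(len(population)), key=scores.__getitem__)
    worst = min(range(len(population)), key=scores.__getitem__)

    screened = list(population)
    screened[worst] = list(population[best])  # copy, so the two do not alias
    return screened


def num_clusters(partition: Partition) -> float:
    """Toy score used only for this example: number of distinct clusters."""
    return float(len(set(partition)))


population = [[0, 0, 0, 0], [0, 1, 0, 1], [0, 1, 2, 2]]
screened = screen_population(population, num_clusters)
```

Here the toy score simply counts clusters, so the single-cluster partition is replaced by a copy of the three-cluster one. A real implementation would rank individuals by the algorithm's own objectives, for example through Pareto dominance, rather than by a single scalar.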
