Cluster ensemble for gene expression microarray data

Ensemble techniques have been successfully applied in supervised learning to increase the accuracy and stability of classification. Recently, similar techniques have been proposed for clustering algorithms. In this context, we analyze the potential of applying cluster ensemble techniques to gene expression microarray data. Our experimental results show that ensembles often yield a significant improvement over the results obtained when the same clustering techniques are used individually.
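To make the idea of a cluster ensemble concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in this work. The abstract does not say which base algorithms or consensus function were used, so the sketch assumes k-means runs with different seeds as the ensemble members and a co-association (evidence accumulation) consensus. The toy expression values and the names `kmeans`, `co_association`, and `consensus` are hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, seed, iters=20):
    """Plain k-means on a list of equal-length tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers are k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def co_association(partitions, n):
    """Fraction of partitions in which each pair of items falls in the same cluster."""
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus(co, threshold=0.5):
    """Link items whose co-association exceeds the threshold and
    return the connected components as consensus cluster labels."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical toy data: six samples, three expression values each, two clear groups.
samples = [
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.0, 0.3, 0.2),
    (5.0, 5.1, 4.9), (5.2, 4.8, 5.1), (4.9, 5.0, 5.3),
]

# Ensemble members: the same algorithm run with ten different random seeds.
partitions = [kmeans(samples, 2, seed) for seed in range(10)]
co = co_association(partitions, len(samples))
consensus_labels = consensus(co)
print(consensus_labels)  # → [0, 0, 0, 1, 1, 1]
```

The co-association matrix sidesteps the label correspondence problem: cluster 0 in one run need not be cluster 0 in another, but whether two samples share a cluster is comparable across runs. On real microarray data the base partitions disagree far more than in this toy case, which is where a consensus step can be more stable than any single run.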
