Paper information - Robust Clustering by Aggregation and Intersection Methods

When dealing with multiple clustering solutions, the problem of extracting a small number of good, distinct solutions becomes crucial. This problem is addressed by so-called Meta Clustering [12], which produces clusters of clustering solutions. Often such groups, called meta-clusters, represent alternative ways of grouping the original data. The next step is to construct a clustering that represents a chosen meta-cluster. In this work, starting from a population of solutions, we build meta-clusters with a hierarchical agglomerative approach based on an entropy-based similarity measure. The user controls the selection of the threshold value through interactive visualizations. Once a meta-cluster is selected, the representative clustering is constructed following two different consensus approaches. The process is illustrated on a synthetic dataset.
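The meta-clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the entropy-based measure or the linkage rule, so the code below assumes variation of information (an entropy-based distance, where smaller means more similar) and complete linkage. The function names `variation_of_information` and `meta_clusters`, the toy solutions, and the threshold value are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(labels):
    """Shannon entropy of the cluster-size distribution of one clustering."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label vectors over the same objects."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def variation_of_information(a, b):
    """Assumed measure: VI = H(a) + H(b) - 2 I(a; b), clamped at zero."""
    return max(0.0, entropy(a) + entropy(b) - 2 * mutual_information(a, b))


def meta_clusters(solutions, threshold):
    """Complete-linkage agglomeration of clusterings, stopped at `threshold`.

    Returns groups of indices into `solutions`; each group is a meta-cluster.
    """
    groups = [[i] for i in range(len(solutions))]

    def linkage(g, h):
        # Complete linkage: the largest pairwise distance between two groups.
        return max(
            variation_of_information(solutions[i], solutions[j])
            for i in g
            for j in h
        )

    while len(groups) > 1:
        p, q = min(
            combinations(range(len(groups)), 2),
            key=lambda pq: linkage(groups[pq[0]], groups[pq[1]]),
        )
        if linkage(groups[p], groups[q]) > threshold:
            break  # the closest pair is too far apart, so stop merging
        groups[p] = groups[p] + groups[q]
        del groups[q]  # p < q, so index p is unaffected by the deletion
    return groups


# Toy population: solutions 0 and 1 are the same partition with swapped
# labels, as are solutions 2 and 3; the two pairs disagree with each other.
solutions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0],
]
print(meta_clusters(solutions, threshold=0.5))
```

In this sketch the threshold is a fixed number; in the workflow the abstract describes, the user would instead choose it interactively from a visualization of the hierarchy.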

[1] Antonino Staiano, et al. A multi-step approach to time series analysis and gene expression clustering, 2006, Bioinform.

[2] Olli Nevalainen, et al. Reallocation of GLA codevectors for evading local minimum, 1996.

[3] Michele Pinelli, et al. Interactive data analysis and clustering of genomic data, 2008, Neural Networks.

[4] Dan Gusfield, et al. Partition-distance: A problem and class of perfect graphs arising in clustering, 2002, Inf. Process. Lett.

[5] A. Bertoni, et al. Random projections for assessing gene expression cluster stability, 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.

[6] Antonino Staiano, et al. Clustering and visualization approaches for human cell cycle gene expression data analysis, 2008, Int. J. Approx. Reason.

[7] Nabil H. Mustafa, et al. k-means projective clustering, 2004, PODS.

[8] Rui Xu, et al. Survey of clustering algorithms, 2005, IEEE Transactions on Neural Networks.

[9] Rich Caruana, et al. Meta Clustering, 2006, Sixth International Conference on Data Mining (ICDM'06).

[10] Jean-Pierre Barthélemy, et al. The Median Procedure for Partitions, 1993, Partitioning Data Sets.

[11] Giorgio Valentini, et al. Characterization of lung tumor subtypes through gene expression cluster validity assessment, 2006, RAIRO Theor. Informatics Appl.

[12] Sam Yuan Sung, et al. Consensus clustering, 2005, Intell. Data Anal.

[13] Ludmila I. Kuncheva, et al. Evaluation of Stability of k-Means Cluster Ensembles with Respect to Random Initialization, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14] Aidong Zhang, et al. Cluster analysis for gene expression data: a survey, 2004, IEEE Transactions on Knowledge and Data Engineering.

[15] Rich Caruana, et al. Consensus Clusterings, 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[16] Petra Perner, et al. Advances in Data Mining, 2002, Lecture Notes in Computer Science.

[17] Anthony Wirth, et al. Are approximation algorithms for consensus clustering worthwhile?, 2007, SDM.

[18] Anil K. Jain, et al. Adaptive clustering ensembles, 2004, ICPR 2004.

[19] Anil K. Jain, et al. Clustering ensembles: models of consensus and weak partitions, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20] Y. P. Hu, et al. Global optimization in clustering using hyperbolic cross points, 2007, Pattern Recognit.

[21] Ming-Yang Kao, et al. On constructing an optimal consensus clustering from multiple clusterings, 2007, Inf. Process. Lett.

[22] Aristides Gionis, et al. Clustering aggregation, 2005, 21st International Conference on Data Engineering (ICDE'05).

[23] Francesco Napolitano, et al. Using Global Optimization to Explore Multiple Solutions of Clustering Problems, 2008, KES.