One Clustering Process Fits All - A Visually Guided Ensemble Approach

Looking back on the past decade of research on clustering algorithms, we observe two major trends: 1) the already vast number of existing clustering algorithms is continuously growing, and 2) clustering algorithms are becoming more and more adapted to specific application domains with very particular assumptions. As a result, algorithms have grown complicated and/or very scenario-dependent, which has made clustering hardly accessible to non-expert users. This is an especially critical development because, with ever more data being gathered, the need for analysis techniques like clustering is emerging in many application domains. In this paper, we oppose the current focus on specialization by proposing our vision of a usable, guided, and universally applicable clustering process. In detail, we describe the work we have already conducted and present our future research directions.
