A novel multi-clustering method for hierarchical clusterings based on boosting

Bagging and boosting have proved to be the best methods for building multiple classifiers in classifier combination problems. For "flat clustering" problems, it is also recognized that multi-clustering methods based on boosting yield clusterings of improved quality. In this paper, we introduce a novel multi-clustering method for "hierarchical clusterings" based on boosting theory, which creates a more stable hierarchical clustering of a dataset. The proposed algorithm runs a boosting iteration in which a bootstrap sample is drawn from the original dataset by weighted random sampling. A hierarchical clustering algorithm is then applied to the selected subsample to build a dendrogram that describes the hierarchy. Finally, the dissimilarity description matrices of the resulting dendrograms are combined into a single consensus matrix using a hierarchical-clustering-combination approach. Experiments on popular real-world datasets show that the boosted method provides higher-quality solutions than standard hierarchical clustering methods.
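
The loop described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method as published: the abstract does not specify the weight-update rule, the dissimilarity descriptor, or the combination scheme, so the sketch uses cophenetic distances as the descriptor, a plain average as the combination step, and a hypothetical update that doubles the weight of elements left out of the current sample. The function name `boosted_hierarchical_clustering` and its parameters are likewise illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform


def boosted_hierarchical_clustering(X, n_rounds=10, sample_frac=0.8,
                                    method="average", seed=0):
    """Return (consensus linkage, consensus dissimilarity matrix)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = min(n, max(2, int(sample_frac * n)))  # subsample size
    weights = np.full(n, 1.0 / n)             # sampling weights
    sums = np.zeros((n, n))                   # summed dissimilarities
    counts = np.zeros((n, n))                 # co-sampling counts

    for _ in range(n_rounds):
        # Weighted random sampling of elements from the original dataset.
        idx = rng.choice(n, size=m, replace=False, p=weights)

        # Build a dendrogram on the subsample and describe it by its
        # cophenetic distance matrix (assumed descriptor).
        Z = linkage(X[idx], method=method)
        coph = squareform(cophenet(Z))
        sums[np.ix_(idx, idx)] += coph
        counts[np.ix_(idx, idx)] += 1

        # Hypothetical update (not from the paper): favor elements
        # that were left out of this round's sample.
        missed = np.ones(n, dtype=bool)
        missed[idx] = False
        weights[missed] *= 2.0
        weights /= weights.sum()

    # Combine by averaging (assumed); pairs never sampled together
    # receive the largest observed dissimilarity.
    seen = counts > 0
    consensus = np.full((n, n), sums[seen].max() / 1.0 if not seen.all() else 0.0)
    consensus[seen] = sums[seen] / counts[seen]
    if not seen.all():
        consensus[~seen] = consensus[seen].max()
    np.fill_diagonal(consensus, 0.0)

    # Final dendrogram built from the consensus dissimilarities.
    Z_final = linkage(squareform(consensus, checks=False), method=method)
    return Z_final, consensus


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(0, 0.5, (20, 2)),
                      rng.normal(10, 0.5, (20, 2))])
    Z, D = boosted_hierarchical_clustering(data)
    print(Z.shape, D.shape)
```

Averaging is the simplest possible combiner and is used here only to keep the sketch short; a dedicated hierarchical-clustering-combination scheme would replace the averaging step without changing the rest of the loop.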
