Hierarchical Clustering Algorithms in Data Mining

Clustering is the process of grouping data objects into clusters so that objects in the same cluster are similar to each other and dissimilar to objects in other clusters. Clustering is one of the core areas of data mining, and its algorithms can be classified as partition-based, hierarchical, density-based, or grid-based. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting overview of the state of the art of these algorithms is intended to help address their current problems and to support the derivation of more robust and scalable clustering algorithms.

Keywords: clustering, method, algorithm, hierarchical, survey.
