Review and Comparative Study of Clustering Techniques

Clustering is a machine learning technique that groups a set of objects into clusters so that objects in the same cluster are as similar as possible, whereas objects in different clusters are as dissimilar as possible. Document clustering applies this idea to text: it groups a given document set, in an unsupervised way, into clusters such that documents within each cluster are more similar to each other than to documents in other clusters. More generally, cluster analysis organizes a collection of patterns into clusters based on similarity. This paper surveys various clustering techniques, which can be divided into several categories: partitional algorithms, hierarchical algorithms, and density-based algorithms. It compares these algorithms and shows how hierarchical clustering can be better than other techniques.
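
To make the hierarchical category concrete, the following is a minimal sketch of agglomerative (bottom-up) clustering, not the implementation of any surveyed paper. It assumes 2-D points, Euclidean distance, and single linkage (the distance between two clusters is the distance between their closest members). The function name `single_linkage`, the sample points, and the target cluster count `k` are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, k):
    """Start with one cluster per point, then repeatedly merge the two
    closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Pick the pair of clusters whose closest members are nearest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                dist(a, b)
                for a in clusters[ij[0]]
                for b in clusters[ij[1]]
            ),
        )
        # combinations() guarantees i < j, so deleting j keeps i valid.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Illustrative data (assumed): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
clusters = single_linkage(points, 3)
for c in clusters:
    print(sorted(c))
```

This brute-force version recomputes all pairwise distances on every merge, so it is only suitable for small inputs. Stopping at a fixed `k` is one possible design choice; a distance threshold could serve as the stopping rule instead.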