Exploration of Various Clustering Algorithms for Text Mining

Advances in technology and a sharp decline in storage costs have led to huge volumes of documents being stored in repositories for future reference. At the same time, retrieving the documents a user is interested in from these enormous collections is time-consuming and costly. Document search can be made more efficient and effective if documents are clustered on the basis of their content. This article presents a comprehensive discussion of various clustering algorithms used in text mining, along with their merits, demerits, and comparisons. Further, the author examines the key challenges that clustering algorithms face in clustering documents effectively.
