Paper information - Comparative Analysis of K-means and K-medoids Algorithm on IRIS Data

Clustering techniques are important for examining data, making predictions from those examinations, and eliminating the discrepancies observed in the data. Iterative techniques group the items of a dataset into clusters according to shared and similar characteristics. Clustering is very useful for identifying and grouping the ever-growing amount of data generated daily, and for extracting patterns and knowledge that can be exploited further. In this paper, we compare the K-means and K-medoids algorithms on the Iris plants dataset from the UCI Machine Learning Repository. The results favoured K-medoids, owing to its better scalability to larger datasets and its greater efficiency compared with K-means. K-medoids outperformed K-means in execution time, in lower sensitivity to outliers, and in noise reduction, since it minimizes the sum of dissimilarities between data points and their cluster medoids.
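The difference in outlier sensitivity described in the abstract can be illustrated with a minimal sketch. It is not the authors' experiment: the five 2-D points are hypothetical values (not Iris measurements), and the helper names `centroid` and `medoid` are introduced here for illustration only. The sketch contrasts the coordinate-wise mean that K-means uses as a cluster representative with the medoid, which is the actual data point that minimizes the sum of dissimilarities to the other points.

```python
import math


def centroid(points):
    """Coordinate-wise mean: the cluster representative used by K-means."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def medoid(points):
    """Data point with the smallest sum of Euclidean distances to all points.

    This is the representative that K-medoids would pick for a single cluster.
    """
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical data (an assumption, not from the paper):
# four nearby points plus one outlier at (10, 10).
cluster = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (10.0, 10.0)]

print("centroid:", centroid(cluster))
print("medoid:  ", medoid(cluster))
```

With these values the single outlier drags the centroid to roughly (2.84, 2.8), away from every actual observation, while the medoid stays at (1.1, 1.0) inside the dense group. This shows only the robustness argument; it says nothing about the execution-time results the paper reports.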

Kalpit G. Soni | D. A. Patel | Dr. Atul Patel
