Comparative Analysis of K-means and K-medoids Algorithm on IRIS Data

Clustering techniques are important methods for examining data, making predictions based on that examination, and eliminating the discrepancies observed in the data. Iterative techniques group the points of a dataset into clusters according to similar or identical characteristics. Clustering is a very useful technique for identifying and grouping the ever-growing amount of data generated on a daily basis and for extracting patterns and knowledge that can be exploited further. In this paper, we compare the K-means and K-medoids algorithms using the Iris plants dataset from the UCI Machine Learning Repository. The results obtained favour the K-medoids algorithm, owing to its better scalability to larger datasets and its greater efficiency compared with K-means. K-medoids showed its superiority over K-means in execution time, in robustness to outliers, and in noise reduction, since it minimizes the sum of dissimilarities between the data points and their cluster representatives (medoids).
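To illustrate the difference in cluster representatives that underlies the outlier behaviour described above, the following is a minimal sketch and not the implementation used in this paper. It assumes Euclidean distance as the dissimilarity measure, and the helper names `centroid` and `medoid` as well as the toy points are hypothetical (they are not Iris measurements). The centroid is the coordinate-wise mean used by K-means, whereas the medoid is the actual data point with the smallest sum of dissimilarities to all other points in the cluster.

```python
import math


def centroid(points):
    """Coordinate-wise mean of the points (the K-means representative)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def medoid(points):
    """Data point with the smallest sum of dissimilarities to all points
    (the K-medoids representative). Euclidean distance is an assumption."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical toy cluster with one outlier at (10, 10); not Iris data.
cluster = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (10.0, 10.0)]

c = centroid(cluster)
m = medoid(cluster)
print("centroid:", c)  # pulled toward the outlier
print("medoid:  ", m)  # remains one of the tightly grouped points
```

In this toy case the single outlier shifts the centroid well away from the four nearby points, while the medoid stays among them. This is only a sketch of the representative-selection step; a full K-medoids procedure would also iterate over the assignment of points to medoids.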