Clustering is one of the most important research areas in data mining. Put simply, clustering divides data into groups such that objects in the same group are similar to one another and dissimilar to objects in other groups. It aims to maximize intra-cluster similarity while minimizing inter-cluster similarity. Clustering is an unsupervised learning technique, and it is useful for discovering interesting patterns and structures in large data sets. It can be applied in many areas, such as marketing studies, DNA analysis, city planning, text mining, and web document classification. Large data sets with many attributes make clustering a complex task, and many methods have been developed to deal with these problems. In this paper, two well-known partitioning-based methods, k-means and k-medoids, are compared. The study explores the behavior of these two methods.
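To make the distinction between the two methods concrete, the following is a minimal Python sketch, not the implementation used in this study. It assumes Euclidean distance, caller-supplied initial centers, and a simple alternating update for k-medoids (rather than the classical PAM swap procedure). The function names `k_means` and `k_medoids` and the toy data are illustrative assumptions.

```python
import math

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def k_means(points, init, iters=100):
    """k-means sketch: each center is the mean of its cluster members."""
    centers = list(init)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Coordinate-wise mean; need not coincide with any data point.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def k_medoids(points, init, iters=100):
    """k-medoids sketch (alternating variant): each center is an actual
    data point that minimizes the total distance to its cluster members."""
    medoids = list(init)
    for _ in range(iters):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(len(medoids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_medoids.append(
                    min(members,
                        key=lambda c: sum(math.dist(c, q) for q in members)))
            else:
                new_medoids.append(medoids[j])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

# Toy data (assumed for illustration): two compact groups and one outlier.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.5, 1.5),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0), (8.5, 8.5),
          (30.0, 30.0)]
init = [points[0], points[5]]

means, mean_labels = k_means(points, init)
medoids, medoid_labels = k_medoids(points, init)
print("k-means centers:  ", means)
print("k-medoids centers:", medoids)
```

On this toy data both methods produce the same grouping, but the k-means center of the second group is pulled toward the outlier (to roughly 12.08 on each axis), whereas the k-medoids center remains the data point (8.5, 8.5). Both sketches depend on the initial centers and can settle in a poor local optimum for an unlucky choice.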