Cluster Analysis on Complex Structured and High-Dimensional Data Objects Using K-means and EM Algorithms

Abstract: Cluster analysis plays an important role in data mining applications such as scientific data exploration, information retrieval, text mining, web analysis, and marketing. A large number of clustering algorithms have been developed in a variety of domains for different types of applications. No single clustering algorithm is suitable for all types of data, so it is important to characterize each partitioning clustering algorithm. The main objective of this work is to evaluate the performance of partitioning clustering techniques on complex data objects and to present a comparative study of the clustering algorithms with respect to the data, the proximity measure, and the specific objective function. In this paper we compare and evaluate clustering algorithms on multiple data sets, including text, business, and stock market data. Such a comparative study can identify one or more problematic factors, such as high dimensionality, efficiency, scalability with data size, and sensitivity to noise in the data.