A Review on Data Clustering Algorithms for Mixed Data

Abstract: Clustering is the unsupervised classification of patterns into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines, which reflects its broad appeal and its usefulness as a step in exploratory data analysis. In general, clustering divides data into groups of similar objects. One significant research area in data mining is the development of methods that update knowledge by building on existing knowledge, since this can generally improve mining efficiency, especially for very large databases. Data mining uncovers hidden, previously unknown, and potentially useful information in large amounts of data. This paper presents a general survey of various clustering algorithms. In addition, it describes the effectiveness of the Self-Organized Map (SOM) algorithm in improving the clustering of mixed data.

Keywords: Data Clustering, Data Mining, Mixed Data Clustering, Self-Organized Map algorithm.
