Mining Multiple Clustering Data for Knowledge Discovery

Clustering is widely used for knowledge discovery. In this paper, we propose Multi-Clustering, an approach that mines the data generated by different clustering methods to discover relationships between clusters of data. The technique first generates combined vectors from the multiple clustering data. It then calculates the distances between the combined vectors using the Mahalanobis distance and clusters the combined vectors with the Agglomerative Hierarchical Clustering method. Finally, it generates relationship vectors that can be used to identify the relationships between clusters. To illustrate the technique, we discuss an example application that mines author clusters and document clusters to identify which authors work in which research areas. We also evaluate the performance of the proposed technique.
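
The first three steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how combined vectors are built, so the sketch simply concatenates each item's vector from two hypothetical clustering views (`author_view`, `doc_view`), and it assumes average linkage for the agglomerative step. All function names and data values are invented for illustration. The relationship-vector step is left out because its construction is not described here.

```python
from itertools import combinations
from math import sqrt


def combine(*views):
    """Concatenate each item's vectors from every view (assumed scheme)."""
    return [sum((list(v) for v in item), []) for item in zip(*views)]


def covariance(rows):
    """Sample covariance matrix of the row vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def invert(m):
    """Gauss-Jordan inversion with partial pivoting; raises if singular."""
    d = len(m)
    aug = [list(row) + [float(i == j) for j in range(d)]
           for i, row in enumerate(m)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def mahalanobis(x, y, inv_cov):
    """sqrt((x - y)^T S^-1 (x - y)) for an inverse covariance matrix S^-1."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    q = sum(diff[i] * inv_cov[i][j] * diff[j]
            for i in range(d) for j in range(d))
    return sqrt(max(q, 0.0))  # guard against tiny negative rounding error


def agglomerative(vectors, inv_cov, n_clusters):
    """Bottom-up merging with average linkage (linkage choice is an assumption)."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(mahalanobis(vectors[i], vectors[j], inv_cov)
                    for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


# Hypothetical per-item scores from two separate clustering views.
author_view = [[1.0], [1.2], [9.0], [8.8], [5.0], [5.1]]
doc_view = [[1.0], [0.9], [1.1], [1.3], [9.0], [8.8]]

combined = combine(author_view, doc_view)
inv_cov = invert(covariance(combined))
clusters = agglomerative(combined, inv_cov, n_clusters=3)
print(clusters)
```

On this toy data the items fall into three tight, well-separated groups, so the choice of linkage has little effect. On real data the covariance matrix may be singular (for example, with one-hot cluster memberships), in which case a regularized or pseudo-inverse would be needed instead of `invert`.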
