Intelligent control of the hierarchical agglomerative clustering process

The basic process of hierarchical agglomerative (HAG) clustering is described as a merging of clusters based on their proximity. We point out the importance of the selected cluster distance measure in determining the resulting clusters. We note a fundamental distinction between the nearest neighbor cluster distance measure, Min, and the furthest neighbor measure, Max: the former favors the merging of large clusters, while the latter favors the merging of smaller clusters. We introduce a number of families of intercluster distance measures, each of which can be parameterized along a scale characterizing its preference for merging larger or smaller clusters. We then consider the use of this distinction between distance measures as a way of controlling the hierarchical clustering process. Combining this with the ability of fuzzy systems modeling to formalize linguistic specifications, we see the emergence of a tool for adding human-like intelligence to the clustering process.
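The idea of a distance measure parameterized between Min and Max can be illustrated with a minimal sketch. The particular family used here, a convex combination of the nearest and furthest neighbor distances controlled by a hypothetical parameter `alpha`, is an assumption chosen for simplicity and is not necessarily one of the families the paper introduces. The function names `cluster_distance` and `hag` are likewise illustrative.

```python
import math
from itertools import combinations


def cluster_distance(a, b, alpha):
    """Intercluster distance between point lists a and b.

    Assumed illustrative family: alpha = 0 gives the nearest neighbor
    measure (Min), alpha = 1 gives the furthest neighbor measure (Max),
    and values in between interpolate linearly.
    """
    dists = [math.dist(x, y) for x in a for y in b]
    return (1 - alpha) * min(dists) + alpha * max(dists)


def hag(points, alpha, k):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest intercluster distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], alpha),
        )
        # combinations yields i < j, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (10.0,), (11.0,)]
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, hag(data, alpha, k=2))
```

Under this assumption, sweeping `alpha` from 0 to 1 moves the merge criterion from Min toward Max, which is the kind of single control knob that a rule base or linguistic specification could set.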
