Improvement of Hierarchical Clustering Results by Refinement of Variable Types and Distance Measures
Hierarchical clustering assigns observations to clusters, which are in turn connected to form a hierarchical structure. Observations in the same cluster are close to one another according to a predetermined distance measure, while observations in different clusters are far apart. This paper presents an implementation of a specific distance measure for computing distances between observations described by a mixture of variable types. The data mining tool ‘Orange’ was used for implementation, testing, data processing, and result visualization. Finally, the results obtained with the already available widget were compared with the output of a newly programmed widget that employs new variable types and the new distance measure. The comparison was carried out on several well-known datasets.
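To make the idea of a distance over mixed variable types concrete, the following is a minimal sketch in plain Python. It is not the widget described in the paper and does not use Orange's API. It assumes a Gower-style dissimilarity (range-normalized absolute difference for numeric variables, simple mismatch for categorical ones, missing values skipped) and a naive average-linkage agglomeration. The function names `gower_distance`, `numeric_ranges`, and `agglomerate`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def gower_distance(a, b, kinds, ranges):
    """Gower-style dissimilarity between two mixed-type observations.

    kinds[k] is "num" or "cat"; ranges[k] is the observed range of
    numeric variable k (None for categorical variables).
    Variables with a missing value (None) in either observation are skipped.
    """
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "num":
            # Range-normalized absolute difference, always in [0, 1].
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else 0.0


def numeric_ranges(rows, kinds):
    """Observed range (max - min) of each numeric column; None for categorical."""
    ranges = []
    for k, kind in enumerate(kinds):
        if kind == "num":
            vals = [r[k] for r in rows if r[k] is not None]
            ranges.append(max(vals) - min(vals) if vals else 0.0)
        else:
            ranges.append(None)
    return ranges


def agglomerate(rows, kinds, n_clusters):
    """Naive average-linkage clustering; returns clusters as lists of row indices."""
    ranges = numeric_ranges(rows, kinds)
    dist = {
        frozenset((i, j)): gower_distance(rows[i], rows[j], kinds, ranges)
        for i, j in combinations(range(len(rows)), 2)
    }

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[frozenset((i, j))] for i in c1 for j in c2) / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average linkage.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return clusters


# Toy data (hypothetical): one numeric and two categorical variables.
kinds = ("num", "cat", "cat")
rows = [
    (1.0, "red", "yes"),
    (1.5, "red", "yes"),
    (9.0, "blue", "no"),
    (10.0, "blue", "no"),
]
clusters = agglomerate(rows, kinds, n_clusters=2)
print(clusters)
```

Stopping the merging at a chosen number of clusters corresponds to cutting the hierarchy at one level. Recording each merge and its linkage value instead would give the full hierarchical structure that a dendrogram visualizes.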