A New Distance Measure for Hierarchical Clustering

The Support Vector Machine (SVM) classifier was originally formulated for binary classification, and its extension to the multi-class case remains an open research problem. Classical approaches such as one-against-one and one-against-all have been used to address the multi-class problem, but they become less appealing when the training set contains a large number of classes. Recent approaches use hierarchical classification for multi-class problems because it scales well with the number of classes. SVM-based hierarchical classifiers partition the data samples with a clustering algorithm, and the classification performance of the overall system depends heavily on the generated clusters. Clustering methods such as k-means, kernel k-means, spherical shells, and balanced subset clustering have been used for this purpose, but the distance measures they use to partition the data samples are not compatible with the SVM classification goal. This paper introduces a new distance measure for partitioning data samples in SVM-based hierarchical classification. Unlike the other clustering methods used for this purpose, the proposed method is suitable when SVMs are used as the base classifiers. As demonstrated in the experiments, integrating the proposed clustering scheme into hierarchical SVM classifiers significantly improves computational efficiency at the cost of a small decrease in recognition accuracy.
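
The scaling argument above can be made concrete with a rough count of binary classifiers. The sketch below is illustrative only and is not taken from the paper: the function name `classifier_counts` is hypothetical, and the hierarchical case assumes an idealized balanced binary tree, whereas the tree produced by an actual clustering step may be unbalanced. For each scheme it returns the number of binary SVMs that must be trained and the number evaluated for one test sample (at most, in the tree case).

```python
def classifier_counts(num_classes):
    """Return {scheme: (trained, evaluated_per_sample)} for K classes.

    Assumption: the hierarchy is a balanced binary tree with one binary
    SVM per internal node; real cluster trees can be deeper.
    """
    k = num_classes
    pairs = k * (k - 1) // 2          # one SVM per pair of classes
    depth = (k - 1).bit_length()      # ceil(log2(k)) for k >= 1, no floats
    return {
        "one_against_one": (pairs, pairs),
        "one_against_all": (k, k),
        "balanced_binary_tree": (k - 1, depth),
    }


if __name__ == "__main__":
    for k in (4, 32, 1024):
        print(k, classifier_counts(k))
```

Under these assumptions, the number of classifiers evaluated per test sample grows quadratically for one-against-one, linearly for one-against-all, and logarithmically for the balanced tree, which is the sense in which hierarchical schemes scale well with the number of classes.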