Density-based Constraint Expansion Method for Semi-supervised Clustering

Most existing semi-supervised clustering methods neglect the structural information of the data, while the small number of available constraints may degrade the performance of the algorithms. This paper presents a Density-based Constraint Expansion (DCE) method. The dataset is represented as a graph, and a density-based graph similarity is introduced on it. The constraint set is then expanded according to the similarity between data samples. The expanded constraint set can be used in any semi-supervised clustering algorithm, including the constrained complete-link algorithm and the pairwise constrained K-means algorithm. Experimental results on several synthetic and real-world datasets show that the DCE method can effectively enhance the performance of semi-supervised clustering algorithms.
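
The idea of expanding constraints through sample similarity can be illustrated with a minimal sketch. The abstract does not define the density-based graph similarity or the expansion rule, so everything below is an assumption made for illustration: the similarity is a local-scaling Gaussian kernel (each distance is normalised by the k-th nearest-neighbour distances of both endpoints, which are small in dense regions), and a constraint is propagated to every sample whose similarity to one of its endpoints is at least a threshold `tau`. The names `local_scale_similarity`, `expand_constraints`, `k`, and `tau` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def local_scale_similarity(points, k=2):
    """Assumed stand-in similarity: exp(-d_ij^2 / (r_i * r_j)).

    r_i is the distance from point i to its k-th nearest neighbour, so it is
    small in dense regions and large in sparse ones. Points are assumed to be
    distinct (otherwise r_i could be zero).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    scale = [sorted(row)[k] for row in dist]  # index 0 is the point itself
    return [
        [math.exp(-dist[i][j] ** 2 / (scale[i] * scale[j])) for j in range(n)]
        for i in range(n)
    ]


def expand_constraints(sim, must_link, cannot_link, tau=0.3):
    """Propagate pairwise constraints to samples similar to their endpoints."""
    n = len(sim)

    def near(i):
        # Samples at least tau-similar to i (always includes i itself).
        return {j for j in range(n) if sim[i][j] >= tau}

    ml, cl = set(), set()
    for a, b in must_link:
        # Everything close to either endpoint is assumed to share its cluster.
        group = near(a) | near(b)
        ml.update(frozenset(pair) for pair in combinations(sorted(group), 2))
    for a, b in cannot_link:
        # Neighbourhoods of the two endpoints are assumed to stay separated.
        cl.update(frozenset((p, q)) for p in near(a) for q in near(b) if p != q)

    # Derived pairs that received both labels are ambiguous and are dropped;
    # the original user-supplied constraints are always kept.
    ambiguous = ml & cl
    ml, cl = ml - ambiguous, cl - ambiguous
    ml.update(frozenset(pair) for pair in must_link)
    cl.update(frozenset(pair) for pair in cannot_link)
    return ml, cl


# Two well-separated groups of three points, with one constraint of each kind.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = local_scale_similarity(points, k=2)
ml, cl = expand_constraints(sim, must_link=[(0, 1)], cannot_link=[(0, 3)])
```

The expanded sets `ml` and `cl` have the same form as the original constraints (unordered index pairs), so under these assumptions they could be passed unchanged to any clustering routine that accepts must-link and cannot-link pairs.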