Novel Semi-supervised Clustering for High Dimensional Data

Semi-supervised clustering has become popular in recent years; it usually incorporates limited background knowledge to improve clustering performance. However, most existing methods based on neighbors or density cannot process high-dimensional data, so it is critical to integrate feature reduction into the semi-supervised clustering process. To solve this problem, we propose a framework for semi-supervised clustering. The framework first preprocesses the instances using the transitivity of constraints, then reduces dimensionality by projecting the features into a low-dimensional space, and finally clusters the instances using the reduced features. To evaluate the effectiveness of the method, we ran experiments on multiple datasets. The results show that the method achieves good clustering performance on high-dimensional data.
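
The three stages of such a framework (constraint propagation, projection, clustering) can be illustrated with a minimal sketch. The abstract does not specify the concrete algorithms, so everything below is an assumption made for illustration: must-link constraints are propagated with a union-find transitive closure, the projection is plain PCA, and the final step is a k-means variant that assigns each must-link group as a unit. Cannot-link constraints are not handled, and the function names `must_link_closure`, `pca_project`, and `constrained_kmeans` are hypothetical.

```python
import numpy as np

def must_link_closure(n, must_link):
    """Propagate must-link pairs transitively (union-find).
    Returns one group id per instance; unconstrained instances form singleton groups."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
    roots = np.array([find(i) for i in range(n)])
    _, groups = np.unique(roots, return_inverse=True)  # relabel to 0..g-1
    return groups

def pca_project(X, d):
    """Project X (n x p) onto its top-d principal directions (assumed reduction step)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def constrained_kmeans(Z, groups, k, n_iter=50, seed=0):
    """k-means over must-link groups: each group is represented by its mean,
    weighted by its size, so all members receive the same cluster label."""
    rng = np.random.default_rng(seed)
    g = groups.max() + 1
    sizes = np.bincount(groups, minlength=g).astype(float)
    means = np.zeros((g, Z.shape[1]))
    np.add.at(means, groups, Z)          # sum the members of each group
    means /= sizes[:, None]
    centers = means[rng.choice(g, size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for c in range(k):
            mask = assign == c
            if mask.any():               # keep the old center if a cluster empties
                centers[c] = np.average(means[mask], axis=0, weights=sizes[mask])
    return assign[groups]                # expand group labels back to instances

# Usage on synthetic data (illustrative only, not the paper's experiments).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 50)), rng.normal(5.0, 1.0, (20, 50))])
groups = must_link_closure(len(X), [(0, 1), (1, 2), (20, 21)])
Z = pca_project(X, d=2)
labels = constrained_kmeans(Z, groups, k=2)
```

Collapsing each must-link group to a weighted mean guarantees that propagated constraints are never violated, at the cost of ignoring cannot-link information; a fuller implementation would need to handle both constraint types.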