Semi-Supervised Dimensionality Reduction Using Pairwise Equivalence Constraints

When labeled data are scarce, side information in the form of pairwise equivalence constraints between points is often used to discover groups within the data. However, existing methods that use side information typically fail in high-dimensional spaces. In this paper, we address the problem of learning from side information for high-dimensional data. To this end, we propose a semi-supervised dimensionality reduction scheme that incorporates pairwise equivalence constraints to find a better embedding space, which in turn improves the performance of the subsequent clustering and classification phases. Our method builds on the assumption that points in a sufficiently small neighborhood tend to have the same label. Equivalence constraints are used to modify these neighborhoods and to increase the separability of different classes. Experimental results on high-dimensional image data sets show that integrating side information into dimensionality reduction improves clustering and classification performance.
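The idea of letting equivalence constraints modify neighborhoods before computing an embedding can be illustrated with a minimal sketch. This is not the method proposed in the paper, whose details the abstract does not give. It assumes a k-nearest-neighbor graph in which must-link pairs are added as edges and cannot-link pairs are removed, followed by a linear projection in the style of Locality Preserving Projections. The function names `constrained_affinity` and `linear_embedding`, the binary edge weights, and the parameters `k`, `dim`, and `reg` are all hypothetical choices made for illustration.

```python
import numpy as np

def constrained_affinity(X, must_link, cannot_link, k=3):
    """Binary k-NN affinity matrix edited by pairwise constraints (illustrative)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances; the diagonal is masked so that
    # a point is never selected as its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[:k]] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the graph
    # Assumption: must-link pairs become neighbors, cannot-link pairs do not.
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W

def linear_embedding(X, W, dim=2, reg=1e-6):
    """LPP-style projection: keep graph neighbors close in the embedding."""
    D = np.diag(W.sum(axis=1))
    L = D - W                                        # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])       # regularized to stay positive definite
    # Solve the generalized problem A a = lambda B a through a Cholesky
    # factorization B = C C^T, which reduces it to a symmetric eigenproblem.
    C = np.linalg.cholesky(B)
    P = np.linalg.solve(C, A)                        # C^{-1} A
    M = np.linalg.solve(C, P.T).T                    # C^{-1} A C^{-T}
    _, vecs = np.linalg.eigh((M + M.T) / 2)          # eigenvalues in ascending order
    proj = np.linalg.solve(C.T, vecs[:, :dim])       # map back: a = C^{-T} y
    return X @ proj
```

Calling `constrained_affinity` on a data matrix with lists of index pairs, and then passing the result to `linear_embedding`, yields a low-dimensional representation on which a standard clustering or classification method could be run. Removing cannot-link edges is only one simple way to reflect such constraints; a scheme that actively pushes cannot-link pairs apart would need an additional term in the objective.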
