Classification of Weakly Labeled Data with Partial Equivalence Relations

In many vision problems, instead of having fully labeled training data, it is easier to obtain the input in small groups, where the samples in each group are constrained to belong to the same class but the actual class label is unknown. Such constraints give rise to partial equivalence relations. The absence of class labels prevents the use of standard discriminative methods in this scenario. On the other hand, state-of-the-art techniques that use partial equivalence relations, e.g., relevant component analysis, learn projections that are optimal for data representation but not for discrimination. We show that this leads to poor performance in several real-world applications, especially those involving high-dimensional data. In this paper, we present a novel discriminative technique for the classification of weakly labeled data that exploits the null space of data scatter matrices to achieve good classification accuracy. We demonstrate the superior performance of both the linear and nonlinear versions of our approach on face recognition, clustering, and image retrieval tasks. Results are reported on standard datasets as well as real-world images and videos from the Web.
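
To make the null-space idea concrete, here is a minimal sketch in Python, assuming the weakly labeled groups ("chunklets") are given as lists of sample indices. The function name, the tolerance parameter, and the choice of a total-scatter criterion inside the null space are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def null_space_chunklet_projection(X, chunklets, tol=1e-10):
    """Sketch of a null-space discriminant for weakly labeled (chunklet) data.

    X         : (n_samples, d) data matrix
    chunklets : list of index arrays; samples within a chunklet share an
                unknown class label (a partial equivalence relation)
    Returns a projection matrix W whose columns lie in the null space of
    the within-chunklet scatter and maximize the total scatter there.
    """
    n, d = X.shape

    # Within-chunklet scatter: deviations of each sample from its chunklet mean.
    # This stands in for the within-class scatter, since labels are unknown.
    Sw = np.zeros((d, d))
    for idx in chunklets:
        Xc = X[idx] - X[idx].mean(axis=0)
        Sw += Xc.T @ Xc

    # Orthonormal basis N of the null space of Sw: eigenvectors whose
    # eigenvalues are (numerically) zero.
    evals, evecs = np.linalg.eigh(Sw)
    N = evecs[:, evals < tol * evals.max()]

    # Total scatter of the data projected into that null space.
    Xt = (X - X.mean(axis=0)) @ N
    St = Xt.T @ Xt

    # Within the null space, every chunklet collapses to a single point,
    # so directions of maximal total scatter separate the groups.
    evals_t, evecs_t = np.linalg.eigh(St)
    order = np.argsort(evals_t)[::-1]
    W = N @ evecs_t[:, order]
    return W  # project new samples with X_new @ W
```

The high-dimensional setting mentioned above is what makes this work: the rank of the within-chunklet scatter is at most the number of samples minus the number of chunklets, so when the dimension d exceeds that, a nontrivial null space is guaranteed to exist.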
