Robust path based semi-supervised dimensionality reduction

In many pattern recognition and data mining tasks, we must learn from a large amount of unlabeled data with only a few pairwise constraints. This setting is a form of semi-supervised learning, and the pairwise constraints are called side-information. The constraints fall into two categories: a must-link constraint states that a pair of instances belongs to the same class, and a cannot-link constraint states that a pair of instances belongs to different classes. The curse of dimensionality arises at the same time when the dimensionality of the original data space is high; thus, many dimensionality reduction algorithms have been proposed, and some of them utilize the side-information of the samples. However, the best learning result cannot be achieved by using the side-information alone. We therefore propose a novel algorithm called Robust Path Based Semi-Supervised Dimensionality Reduction (RPSSDR). RPSSDR not only utilizes the pairwise constraints but also captures the manifold structure of the data by using a robust path-based similarity measure. A kernel extension of RPSSDR for nonlinear dimensionality reduction is also presented. In addition, the method learns a transformation matrix, so unseen samples can be handled easily. Experimental results on high-dimensional facial databases demonstrate the effectiveness of the proposed method.
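As a small illustration of the two constraint types described above, the following sketch builds must-link and cannot-link pairs from a handful of labeled samples. It is not part of RPSSDR itself: the function name `pairwise_constraints` and the idea of deriving the pairs from known labels are assumptions made for this example, since in practice the side-information may be supplied directly as pairs.

```python
from itertools import combinations


def pairwise_constraints(labels):
    """Build side-information from a few labeled samples.

    `labels` is a hypothetical dict mapping a sample index to its class
    label. Returns (must_link, cannot_link), each a list of index pairs.
    """
    must_link, cannot_link = [], []
    # Visit every unordered pair of labeled samples exactly once.
    for (i, yi), (j, yj) in combinations(sorted(labels.items()), 2):
        if yi == yj:
            must_link.append((i, j))    # same class
        else:
            cannot_link.append((i, j))  # different classes
    return must_link, cannot_link


# Three labeled samples out of an otherwise unlabeled data set.
must_link, cannot_link = pairwise_constraints({0: "a", 3: "a", 7: "b"})
print(must_link)    # [(0, 3)]
print(cannot_link)  # [(0, 7), (3, 7)]
```

Only the index pairs are passed on to a constraint-based method; the class labels themselves are discarded, which is what distinguishes side-information from ordinary supervision.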
