SSPS: A Semi-Supervised Pattern Shift for Classification

Recently, considerable effort has been devoted to unsupervised and (semi-)supervised dimensionality reduction (DR) techniques, and DR is widely applied as a preprocessing step for classification in practice. However, many DR approaches do not necessarily lead to better classification performance, and DR often suffers from the problem of estimating the retained dimensionality for real-world data. As an alternative, in this paper we propose a new semi-supervised data preprocessing technique named semi-supervised pattern shift (SSPS). SSPS has two advantages: it naturally avoids estimating the retained dimensionality, and it yields a new shifted pattern representation that may be more favorable to classification. As further extensions of SSPS, we develop fast and out-of-sample versions, both of which are based on a shape-preserved subset selection trick. Experimental results demonstrate that the proposed SSPS is promising and effective for classification.
