A Hyperplane-Based Algorithm for Semi-Supervised Dimension Reduction

We consider the semi-supervised dimension reduction problem: given a high-dimensional dataset with a small number of labeled samples and a huge number of unlabeled samples, the goal is to find a low-dimensional embedding that yields good classification results. Most previous algorithms for this task are linkage-based: they enforce must-link and cannot-link constraints during dimension reduction, which leads to a nearest-neighbor classifier in the low-dimensional space. In this paper, we propose a new hyperplane-based semi-supervised dimension reduction method whose main objective is to learn low-dimensional features that both approximate the original data and form a good separating hyperplane. We formulate this as a non-convex optimization problem and propose an efficient algorithm to solve it. The algorithm scales to problems with millions of features and can easily incorporate non-negativity constraints in order to learn interpretable non-negative features. Experiments on real-world datasets demonstrate that our hyperplane-based dimension reduction method outperforms state-of-the-art linkage-based methods when very few labels are available.
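
The abstract does not spell out the optimization problem, but a minimal sketch of one plausible joint objective is shown below: an NMF-style reconstruction term plus a hinge loss on the embeddings of the labeled samples, minimized by alternating projected gradient steps. The function name, the specific objective, the update scheme, and all hyperparameters here are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def semi_supervised_dr(X, y, labeled_idx, k, lam=1.0, C=1.0,
                       n_iter=50, lr=1e-3, seed=0):
    """Sketch of a joint objective of the (assumed) form

        min_{W,H,w}  ||X - W H||_F^2
                     + lam * sum_{i in labeled} max(0, 1 - y_i * w^T h_i)
                     + (C/2) * ||w||^2,   with W, H >= 0,

    solved by alternating projected gradient steps.
    X: d x n data matrix; y: labels in {-1, +1} for the labeled columns;
    labeled_idx: indices of labeled columns; k: embedding dimension.
    Constant factors in the gradients are absorbed into the step size lr.
    """
    rng = np.random.default_rng(seed)
    labeled_idx = np.asarray(labeled_idx)
    d, n = X.shape
    W = np.abs(rng.standard_normal((d, k))) * 0.01
    H = np.abs(rng.standard_normal((k, n))) * 0.01
    w = np.zeros(k)
    for _ in range(n_iter):
        # Gradients of the reconstruction term ||X - WH||_F^2.
        R = W @ H - X
        gW = R @ H.T
        gH = W.T @ R
        # Hinge-loss gradients on the labeled embeddings: only points
        # violating the margin (y_i * w^T h_i < 1) contribute.
        margins = y * (w @ H[:, labeled_idx])
        active = margins < 1
        gw = C * w - lam * (H[:, labeled_idx][:, active] * y[active]).sum(axis=1)
        gH[:, labeled_idx[active]] -= lam * np.outer(w, y[active])
        # Projected gradient steps: projection onto the non-negative
        # orthant enforces the non-negativity constraints on W and H.
        W = np.maximum(W - lr * gW, 0.0)
        H = np.maximum(H - lr * gH, 0.0)
        w = w - lr * gw
    return W, H, w

# Toy usage: 100-dim data, 200 points, 5 labeled, 10 latent features.
X = np.abs(np.random.default_rng(1).standard_normal((100, 200)))
labeled_idx = np.arange(5)
y = np.array([1, -1, 1, -1, 1])
W, H, w = semi_supervised_dr(X, y, labeled_idx, k=10)
```

Alternating over (W, H) and w, rather than taking one joint step, would match the block structure of such objectives more closely; the single-loop version above is kept only for brevity.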
