Sparse Projections over Graph

Recent studies have shown that canonical algorithms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) can be derived from a graph-based dimensionality reduction framework. However, these algorithms yield projective maps that are linear combinations of all the original features, so the results are difficult to interpret psychologically and physiologically. This paper presents a novel technique for learning a sparse projection over graphs. The data in the reduced subspace are represented as a linear combination of a subset of the most relevant features. Compared with PCA and LDA, the results obtained by sparse projection are often easier to interpret. Our algorithm is based on a graph embedding model, which encodes the discriminating and geometrical structure in terms of the data affinity. Once the embedding results are obtained, we apply regularized regression to learn a set of sparse basis functions. Specifically, by using an L1-norm regularizer (e.g., lasso), the sparse projections can be computed efficiently. Experimental results on two document databases demonstrate the effectiveness of our method.
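The two-stage procedure described above (graph embedding first, then L1-regularized regression onto the embedding) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The k-nearest-neighbour 0/1 affinity, the normalized spectral embedding, the plain coordinate-descent lasso solver, the centering and scaling steps, and all names and parameter values (`knn_affinity`, `graph_embedding`, `lasso_cd`, `sparse_projection`, `k`, `alpha`) are assumptions chosen for illustration only.

```python
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix (assumed graph choice)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrize

def graph_embedding(W, dim):
    """Solve W y = lambda D y and keep the `dim` leading non-trivial solutions."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^-1/2 W D^-1/2
    _, vecs = np.linalg.eigh(S)                        # eigenvalues ascending
    U = vecs[:, -(dim + 1):-1][:, ::-1]                # skip the top (trivial) vector
    return U * d_inv_sqrt[:, None]

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - X a||^2 + alpha * ||a||_1."""
    n, p = X.shape
    a = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()                         # residual y - X a
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * a[j]                        # remove feature j's contribution
            rho = X[:, j] @ r / n
            a[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r -= X[:, j] * a[j]                        # add the updated contribution back
    return a

def sparse_projection(X, dim=1, k=5, alpha=0.1):
    """Embed the samples via the graph, then regress each embedding onto the features."""
    Y = graph_embedding(knn_affinity(X, k), dim)
    Xc = X - X.mean(axis=0)                            # centering replaces an intercept
    Yc = Y - Y.mean(axis=0)
    Yc = Yc / Yc.std(axis=0)                           # unit scale so alpha is comparable
    return np.column_stack([lasso_cd(Xc, Yc[:, i], alpha) for i in range(dim)])

# Synthetic data: two groups that differ only in the first two of 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
X[:30, :2] += 4.0
A = sparse_projection(X, dim=1, k=5, alpha=0.1)
print("non-zero coefficients per basis vector:", np.count_nonzero(A, axis=0))
```

Each column of `A` is one sparse basis vector, and the reduced representation of the data is `(X - X.mean(axis=0)) @ A`. Increasing `alpha` drives more coefficients to exactly zero, which is the property that makes the projection easier to interpret than a dense PCA or LDA direction.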
