Locality Preserving Feature Learning

Locality Preserving Indexing (LPI) has been quite successful in tackling document analysis problems such as clustering and classification. The approach relies on the Locality Preserving Criterion, which preserves the locality of the data points. However, LPI takes every word in a data corpus into account, even though many words are not useful for document clustering. To overcome this problem, we propose an approach called Locality Preserving Feature Learning (LPFL), which incorporates feature selection into LPI. Specifically, we aim to find a subset of features and to learn a linear transformation that optimizes the Locality Preserving Criterion based on these features. The resulting optimization problem is a mixed integer program, which we relax into a constrained Frobenius norm minimization problem and solve using a variant of the Alternating Direction Method (ADM). ADM, which iteratively updates the linear transformation matrix, the residue matrix, and the Lagrange multiplier, is theoretically guaranteed to converge at the rate O(1/t). Experiments on benchmark document datasets show that our proposed method outperforms LPI, as well as other state-of-the-art document analysis approaches.
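The three-block ADM iteration described above (transformation update, residue update, multiplier update) can be sketched on a toy problem. The paper's actual objective, the Locality Preserving Criterion with a feature-selection constraint, is not reproduced here; this is a minimal stand-in on an assumed constrained Frobenius norm problem, min over W, E of (1/2)||E||_F^2 subject to XW + E = Y, chosen only to illustrate the alternating update structure.

```python
# Illustrative ADM loop for a toy constrained Frobenius-norm problem:
#   min_{W,E} 0.5 * ||E||_F^2   s.t.   X @ W + E = Y
# This is NOT the LPFL objective from the paper; it is a hypothetical
# stand-in showing the three alternating updates the abstract mentions:
# the transformation W, the residue E, and the multiplier Lam.
import numpy as np

def adm_demo(X, Y, mu=1.0, iters=200):
    n, d = X.shape
    k = Y.shape[1]
    W = np.zeros((d, k))
    E = np.zeros((n, k))
    Lam = np.zeros((n, k))
    # Precompute the (slightly regularized) least-squares operator.
    lsq = np.linalg.solve(X.T @ X + 1e-8 * np.eye(d), X.T)
    for _ in range(iters):
        # W-update: least squares against the current residue and multiplier.
        W = lsq @ (Y - E - Lam / mu)
        # E-update: closed-form minimizer of the augmented Lagrangian in E.
        E = (mu / (1.0 + mu)) * (Y - X @ W - Lam / mu)
        # Multiplier update: gradient ascent on the dual variable.
        Lam = Lam + mu * (X @ W + E - Y)
    return W, E

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
Y = rng.standard_normal((50, 2))
W, E = adm_demo(X, Y)
# At convergence the constraint X @ W + E = Y should hold tightly.
print(np.linalg.norm(X @ W + E - Y))
```

On this toy instance the constraint residual shrinks geometrically, consistent with the general O(1/t) worst-case guarantee the abstract cites for the relaxed problem.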
