Supervised dimensionality reduction via sequential semidefinite programming

Many dimensionality reduction problems end up with a trace quotient formulation. Because the trace quotient problem is difficult to solve directly, the trace quotient cost function is traditionally replaced by an approximation to which generalized eigenvalue decomposition can be applied. In contrast, we optimize the trace quotient directly in this work. The problem is reformulated as a quasi-linear semidefinite optimization problem, which can be solved globally and efficiently using standard off-the-shelf semidefinite programming solvers. This optimization strategy also allows one to enforce additional constraints (for example, sparseness constraints) on the projection matrix. We apply this optimization framework to a novel dimensionality reduction algorithm. The performance of the proposed algorithm is demonstrated in experiments on several UCI machine learning benchmark data sets, USPS handwritten digits, and the ORL and Yale face data.
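
To make the trace quotient concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the common form of the problem, maximizing tr(W^T S_b W) / tr(W^T S_w W) over matrices W with orthonormal columns, where S_b is symmetric positive semidefinite and S_w is symmetric positive definite. For the unconstrained case only, the semidefinite feasibility test at a fixed quotient value delta can be replaced by an eigenvalue check, because the maximum of tr(A Z) over tr(Z) = d, 0 <= Z <= I equals the sum of the d largest eigenvalues of A. The sketch bisects on delta using that check instead of calling an SDP solver; additional constraints such as sparseness would require a real solver. The names `trace_quotient`, `top_d` and `bisect_trace_quotient` are hypothetical.

```python
import numpy as np


def trace_quotient(W, S_b, S_w):
    """Value of tr(W^T S_b W) / tr(W^T S_w W) for a given projection W."""
    return np.trace(W.T @ S_b @ W) / np.trace(W.T @ S_w @ W)


def top_d(A, d):
    """Sum of the d largest eigenvalues of symmetric A and their eigenvectors."""
    vals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    return vals[-d:].sum(), vecs[:, -d:]


def bisect_trace_quotient(S_b, S_w, d, tol=1e-8, max_iter=200):
    """Bisection on the quotient value delta (illustrative sketch).

    Assumes S_b is symmetric PSD and S_w is symmetric positive definite.
    A value delta is achievable iff the sum of the d largest eigenvalues
    of (S_b - delta * S_w) is non-negative.
    """
    lo = 0.0  # achievable because S_b is PSD
    # Upper bound: largest possible numerator over smallest possible denominator.
    hi = np.trace(S_b) / np.linalg.eigvalsh(S_w)[:d].sum()
    _, W = top_d(S_b, d)  # feasible projection for delta = 0
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        s, V = top_d(S_b - mid * S_w, d)
        if s >= 0:      # delta = mid is achievable: raise the lower bound
            lo, W = mid, V
        else:           # delta = mid is not achievable: lower the upper bound
            hi = mid
    return W, lo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 3))
    B = rng.standard_normal((6, 6))
    S_b = A @ A.T                 # toy "between-class" matrix (PSD)
    S_w = B @ B.T + np.eye(6)     # toy "within-class" matrix (PD)
    W, delta = bisect_trace_quotient(S_b, S_w, d=2)
    print("bisection lower bound:", delta)
    print("trace quotient of W:  ", trace_quotient(W, S_b, S_w))
```

The returned W has orthonormal columns and a trace quotient at least as large as the final lower bound, which is within `tol` of the optimum under the stated assumptions.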
