Projection algorithms for nonconvex minimization with application to sparse principal component analysis

We consider concave minimization problems over nonconvex sets. Optimization problems with this structure arise in sparse principal component analysis. We analyze both a gradient projection algorithm and an approximate Newton algorithm in which the Hessian approximation is a multiple of the identity. Convergence results are established. In numerical experiments on sparse principal component analysis problems, the gradient projection algorithm performs very similarly to the truncated power method and the generalized power method. In some cases, the approximate Newton algorithm with a Barzilai–Borwein Hessian approximation and a nonmonotone line search can be substantially faster than the other algorithms and can converge to a better solution.
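To make the gradient projection idea concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the common sparse PCA formulation, which this abstract does not spell out: minimize the concave function f(x) = -xᵀAx over unit vectors with at most k nonzero entries, where A is symmetric positive semidefinite. The function names, the fixed step size, and the stopping tolerance are all hypothetical choices made for illustration. The approximate Newton variant with a Barzilai–Borwein approximation and nonmonotone line search is not shown.

```python
import math


def project_sparse_sphere(y, k):
    """Project y onto {x : ||x||_2 = 1, at most k nonzeros}.

    Keeps the k largest-magnitude entries of y, zeros the rest,
    and rescales the result to unit length.
    """
    keep = sorted(range(len(y)), key=lambda i: abs(y[i]), reverse=True)[:k]
    z = [0.0] * len(y)
    for i in keep:
        z[i] = y[i]
    norm = math.sqrt(sum(v * v for v in z))
    if norm == 0.0:
        raise ValueError("cannot project the zero vector")
    return [v / norm for v in z]


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def gradient_projection(A, k, x0, step=1.0, iters=200, tol=1e-12):
    """Fixed-step gradient projection for f(x) = -x^T A x (assumed objective)."""
    x = project_sparse_sphere(x0, k)
    for _ in range(iters):
        grad = [-2.0 * v for v in matvec(A, x)]  # gradient of -x^T A x
        trial = [xi - step * gi for xi, gi in zip(x, grad)]
        x_new = project_sparse_sphere(trial, k)
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


# Small illustrative covariance-like matrix (made-up data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 1.0]]
x = gradient_projection(A, k=2, x0=[1.0, 1.0, 1.0])
explained = sum(xi * yi for xi, yi in zip(x, matvec(A, x)))  # x^T A x
```

In this sketch, each iterate is a normalized, truncated copy of (I + 2·step·A)x. As the step grows, the identity term matters less and the update approaches a truncated power iteration, which is consistent with the abstract's observation that the two methods behave similarly.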
