Structured Sparse Principal Component Analysis

We present an extension of sparse PCA, or sparse dictionary learning, in which the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This \emph{structured sparse PCA} is based on a structured regularization recently introduced by [1]. While classical sparse priors only deal with \emph{cardinality}, the regularization we use encodes higher-order information about the data. We propose an efficient and simple optimization procedure to solve this problem. Experiments on two practical tasks, face recognition and the study of the dynamics of a protein complex, demonstrate the benefits of the proposed structured approach over unstructured approaches.
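To make the contrast between cardinality-only and structured penalties concrete, here is a minimal sketch. It assumes the structured regularizer is a sum of Euclidean norms over overlapping groups of variables, which is one common form of structured sparsity-inducing norm; the abstract does not spell out the exact regularizer, so the group family, the function names `l1_penalty` and `group_penalty`, and the toy vectors are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def l1_penalty(v):
    """Classical sparse penalty: depends only on the magnitudes of the entries."""
    return float(np.sum(np.abs(v)))

def group_penalty(v, groups):
    """Sum of l2 norms over (possibly overlapping) index groups.

    Assumed form of a structured penalty; the groups are hypothetical.
    """
    return float(sum(np.linalg.norm(v[list(g)]) for g in groups))

p = 6
# Hypothetical group family on a 1-D chain: every proper prefix and suffix.
# Zeroing out whole groups of this family leaves a contiguous interval.
prefixes = [range(0, k + 1) for k in range(p - 1)]
suffixes = [range(k, p) for k in range(1, p)]
groups = prefixes + suffixes

# Two vectors with the same cardinality and the same l1 norm.
contiguous = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
scattered = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 1.0])

l1_contiguous = l1_penalty(contiguous)
l1_scattered = l1_penalty(scattered)
structured_contiguous = group_penalty(contiguous, groups)
structured_scattered = group_penalty(scattered, groups)

print("l1:", l1_contiguous, l1_scattered)
print("structured:", structured_contiguous, structured_scattered)
```

Under these assumptions, the l1 penalty cannot tell the two supports apart, while the group-based penalty is smaller for the contiguous support, which is the sense in which such a regularizer can favor a prespecified set of shapes.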

[1] Tong Zhang, et al. Multi-stage Convex Relaxation for Learning with Sparse Regularization, 2008, NIPS.

[2] Fan Chung, et al. Spectral Graph Theory, 1996.

[3] R. Tibshirani, et al. Sparse Principal Component Analysis, 2006.

[4] Amnon Shashua, et al. Nonnegative Sparse PCA, 2006, NIPS.

[5] Junzhou Huang, et al. Learning with structured sparsity, 2009, ICML '09.

[6] Francis R. Bach, et al. Consistency of the group Lasso and multiple kernel learning, 2007, J. Mach. Learn. Res.

[7] Charles A. Micchelli, et al. Learning the Kernel Function via Regularization, 2005, J. Mach. Learn. Res.

[8] Thérèse E Malliavin, et al. Dynamics and energetics: a consensus analysis of the impact of calcium on EF-CaM protein complex, 2009, Biophysical journal.

[9] Jean Ponce, et al. Convex Sparse Matrix Factorizations, 2008, ArXiv.

[10] R. Tibshirani, et al. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, 2009, Biostatistics.

[11] Francis R. Bach, et al. Structured Variable Selection with Sparsity-Inducing Norms, 2009, J. Mach. Learn. Res.

[12] H. Sebastian Seung, et al. Learning the parts of objects by non-negative matrix factorization, 1999, Nature.

[13] I. Jolliffe, et al. A Modified Principal Component Technique Based on the LASSO, 2003.

[14] Marc'Aurelio Ranzato, et al. Learning invariant features through topographic filter maps, 2009, CVPR.

[15] Ben Taskar, et al. Joint covariate selection and joint subspace selection for multiple classification problems, 2010, Stat. Comput.

[16] Volkan Cevher, et al. Model-Based Compressive Sensing, 2008, IEEE Transactions on Information Theory.

[17] Rajat Raina, et al. Efficient sparse coding algorithms, 2006, NIPS.

[18] Guillermo Sapiro, et al. Online dictionary learning for sparse coding, 2009, ICML '09.

[19] K. Schittkowski, et al. Nonlinear Programming, 2022.

[20] Lawrence Carin, et al. Exploiting Structure in Wavelet-Based Bayesian Compressive Sensing, 2009, IEEE Transactions on Signal Processing.

[21] Avinash C. Kak, et al. PCA versus LDA, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[22] Lester W. Mackey, et al. Deflation Methods for Sparse PCA, 2008, NIPS.

[23] U. Feige, et al. Spectral Graph Theory, 2015.

[24] Geoffrey J. Gordon, et al. A Unified View of Matrix Factorization Models, 2008, ECML/PKDD.

[25] Guillermo Sapiro, et al. Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations, 2009, NIPS.

[26] Shai Avidan, et al. Spectral Bounds for Sparse PCA: Exact and Greedy Algorithms, 2005, NIPS.

[27] Alexandre d'Aspremont, et al. Optimal Solutions for Sparse Principal Component Analysis, 2007, J. Mach. Learn. Res.

[28] E. Lehmann. Testing Statistical Hypotheses, 1960.

[29] Jean-Philippe Vert, et al. Group lasso with overlap and graph lasso, 2009, ICML '09.