Fast and Robust Archetypal Analysis for Representation Learning

We revisit a pioneer unsupervised learning technique called archetypal analysis, [5] which is related to successful data analysis methods such as sparse coding [18] and non-negative matrix factorization [19]. Since it was proposed, archetypal analysis did not gain a lot of popularity even though it produces more interpretable models than other alternatives. Because no efficient implementation has ever been made publicly available, its application to important scientific problems may have been severely limited. Our goal is to bring back into favour archetypal analysis. We propose a fast optimization scheme using an active-set strategy, and provide an efficient open-source implementation interfaced with Matlab, R, and Python. Then, we demonstrate the usefulness of archetypal analysis for computer vision tasks, such as codebook learning, signal classification, and large image collection visualization.

[1]  P. Paatero,et al.  Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values† , 1994 .

[2]  David J. Field,et al.  Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.

[3]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[4]  D. Botstein,et al.  Cluster analysis and display of genome-wide expression patterns. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[5]  D. Hunter,et al.  Optimization Transfer Using Surrogate Objective Functions , 2000 .

[6]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[7]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[8]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[9]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[10]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[11]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[12]  C. Ji An Archetypal Analysis on , 2005 .

[13]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[14]  P. Groenen,et al.  Nearest convex hull classification , 2006 .

[15]  Marc'Aurelio Ranzato,et al.  Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Yihong Gong,et al.  Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.

[18]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[20]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[21]  Guillermo Sapiro,et al.  Online Learning for Matrix Factorization and Sparse Coding , 2009, J. Mach. Learn. Res..

[22]  Guillermo Sapiro,et al.  Classification and clustering via dictionary learning with structured incoherence and shared features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[23]  F. Bach,et al.  Optimization with Sparsity-Inducing Penalties (Foundations and Trends(R) in Machine Learning) , 2011 .

[24]  Julien Mairal,et al.  Optimization with Sparsity-Inducing Penalties , 2011, Found. Trends Mach. Learn..

[25]  Jean Ponce,et al.  Task-Driven Dictionary Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Stéphane Mallat,et al.  Solving Inverse Problems With Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity , 2010, IEEE Transactions on Image Processing.

[27]  Alexei A. Efros,et al.  What makes Paris look like Paris? , 2015, Commun. ACM.

[28]  Guillermo Sapiro,et al.  See all by looking at a few: Sparse modeling for finding representative objects , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  K. Schittkowski,et al.  NONLINEAR PROGRAMMING , 2022 .