Probabilistic methods for approximate archetypal analysis

Archetypal analysis is an unsupervised learning method for exploratory data analysis. One major challenge that limits its applicability in practice is the inherent computational complexity of existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques that reduce the dimension and the representation cardinality of the data, respectively. We prove that, provided the data is approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with few vertices, our method can effectively reduce the computational scaling of archetypal analysis. Moreover, the solution of the reduced problem is near-optimal in terms of prediction error. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of archetypal analysis. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
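
To make the two preprocessing ideas concrete, the following is a minimal sketch and not the algorithm analyzed in the paper, whose details are not given in this abstract. It assumes a generic randomized range finder for the dimension-reduction step and random-direction extreme-point selection for the cardinality-reduction step. The function names `reduce_dimension` and `extreme_point_indices`, and the parameters `oversample` and `num_directions`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def reduce_dimension(X, k, oversample=5, seed=0):
    """Project the rows of X (n x d) onto an approximate dominant row subspace.

    Illustrative assumption: a Gaussian sketch of the row space followed by QR.
    Returns the reduced data Z (n x m) and the orthonormal basis Q (d x m),
    where m = min(d, k + oversample).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    omega = rng.standard_normal((n, k + oversample))
    # Columns of Y are random combinations of the rows of X,
    # so range(Y) approximates the row space of X.
    Y = X.T @ omega
    Q, _ = np.linalg.qr(Y)  # orthonormal basis for range(Y)
    return X @ Q, Q


def extreme_point_indices(Z, num_directions, seed=0):
    """Select candidate convex-hull vertices of the rows of Z.

    Illustrative assumption: for each random Gaussian direction g, keep the
    row that maximizes <z, g>. A unique maximizer of a linear functional over
    a finite point set is a vertex of its convex hull.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((Z.shape[1], num_directions))
    return np.unique(np.argmax(Z @ G, axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data lying exactly in a 3-dimensional subspace of R^50.
    X = rng.standard_normal((1000, 3)) @ rng.standard_normal((3, 50))
    Z, Q = reduce_dimension(X, k=3)
    idx = extreme_point_indices(Z, num_directions=200)
    print("reduced shape:", Z.shape)
    print("points kept:", idx.size, "of", X.shape[0])
    print("relative reconstruction error:",
          np.linalg.norm(Z @ Q.T - X) / np.linalg.norm(X))
```

Under these assumptions, an archetypal analysis solver would then be run on the rows `Z[idx]` rather than on the full data `X`. How well that reduced problem approximates the original depends on the low-dimensional and few-vertex conditions stated above.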
