Non-Negative Sparse Regression and Column Subset Selection with L1 Error

We consider the problems of sparse regression and column subset selection under L1 error. For both problems, we show that in the non-negative setting it is possible to obtain tight and efficient approximations, without any additional structural assumptions (such as restricted isometry, incoherence, or expansion). For sparse regression, given a matrix A and a vector b with non-negative entries, we give an efficient algorithm that outputs a vector x of sparsity O(k) whose error ‖Ax − b‖_1 is comparable to the smallest error achievable by any non-negative k-sparse x. We then use this technique to obtain our main result: an efficient algorithm for column subset selection under L1 error for non-negative matrices.
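To make the sparse-regression objective concrete: without the sparsity constraint, minimizing ‖Ax − b‖_1 over non-negative x is itself a linear program. The sketch below is not the paper's algorithm (which additionally guarantees O(k) sparsity); it is only a minimal illustration of the non-negative L1 regression subproblem, using the standard reformulation with auxiliary variables t ≥ |Ax − b| and an off-the-shelf LP solver.

```python
import numpy as np
from scipy.optimize import linprog

def nonneg_l1_regression(A, b):
    """Solve min_{x >= 0} ||Ax - b||_1 as a linear program.

    Introduce t >= |Ax - b| entrywise and minimize sum(t):
        minimize  sum(t)
        s.t.      Ax - t <=  b     (covers  Ax - b <= t)
                 -Ax - t <= -b     (covers -(Ax - b) <= t)
                  x >= 0, t >= 0.
    Returns the minimizer x and the optimal L1 error.
    """
    m, n = A.shape
    # Decision variables are [x (n entries); t (m entries)].
    c = np.concatenate([np.zeros(n), np.ones(m)])
    A_ub = np.vstack([
        np.hstack([A, -np.eye(m)]),    #  Ax - t <=  b
        np.hstack([-A, -np.eye(m)]),   # -Ax - t <= -b
    ])
    b_ub = np.concatenate([b, -b])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n + m))
    return res.x[:n], res.fun

# Small usage example: b lies exactly in the non-negative cone of A's
# columns, so the optimal L1 error is 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = A @ np.array([1.0, 2.0])
x, err = nonneg_l1_regression(A, b)
```

The paper's contribution can be read as showing that, in the non-negative setting, one can insist on O(k)-sparse x while staying competitive with the best non-negative k-sparse solution, which the plain LP above does not enforce.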
