Column Subset Selection Problem is UG-hard

We address two problems related to selecting an optimal subset of columns from a matrix. In one of these problems, we are given a matrix $A \in \mathbb{R}^{m \times n}$ and a positive integer $k$, and we want to select a sub-matrix $C$ of $k$ columns to minimize $\|A - \Pi_C A\|_F$, where $\Pi_C = C C^{+}$ denotes the matrix of projection onto the space spanned by $C$. In the other problem, we are given $A \in \mathbb{R}^{m \times n}$, positive integers $c$ and $r$, and we want to select sub-matrices $C$ and $R$ of $c$ columns and $r$ rows of $A$, respectively, to minimize $\|A - C U R\|_F$, where $U \in \mathbb{R}^{c \times r}$ is the pseudo-inverse of the intersection between $C$ and $R$. Although there is a plethora of algorithmic results, the complexity of these problems has not been investigated thus far. We show that these two problems are NP-hard assuming UGC.

Select a subset of columns/rows of a matrix so that they represent the matrix well. Formulated as Column Subset Selection Problem and Column-Row Subset Selection Problem. Unique Games Conjecture implies that there is no PTAS. First complexity theoretic result of this kind for these problems.
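The two objectives defined above can be evaluated directly for a given selection. The following is a minimal NumPy sketch, not taken from the paper: the function names `cssp_error`, `cur_error`, and `brute_force_cssp` are hypothetical, and the exhaustive search is included only to illustrate the combinatorial nature of the problem on tiny inputs.

```python
import itertools

import numpy as np


def cssp_error(A, cols):
    """Frobenius error ||A - Pi_C A||_F, with Pi_C = C C^+ and C = A[:, cols]."""
    C = A[:, list(cols)]
    projector = C @ np.linalg.pinv(C)  # orthogonal projector onto span(C)
    return np.linalg.norm(A - projector @ A, "fro")


def cur_error(A, cols, rows):
    """Frobenius error ||A - C U R||_F, with U the pseudo-inverse of the
    intersection of the selected rows and columns."""
    cols, rows = list(cols), list(rows)
    C = A[:, cols]                 # m x c
    R = A[rows, :]                 # r x n
    W = A[np.ix_(rows, cols)]      # r x c intersection of R and C
    U = np.linalg.pinv(W)          # c x r
    return np.linalg.norm(A - C @ U @ R, "fro")


def brute_force_cssp(A, k):
    """Try every k-subset of columns (exponential time; illustration only)."""
    n = A.shape[1]
    return min(itertools.combinations(range(n), k),
               key=lambda subset: cssp_error(A, subset))


if __name__ == "__main__":
    A = np.diag([3.0, 2.0, 1.0])
    print(brute_force_cssp(A, 2))           # best pair of columns
    print(cssp_error(A, (0, 1)))            # residual left by that pair
    print(cur_error(A, (0, 1), (0, 1)))     # CUR residual for the same indices
```

For the diagonal example, keeping the two columns with the largest entries leaves only the smallest diagonal entry unexplained, so both errors equal 1. The brute-force search examines all $\binom{n}{k}$ subsets, which is feasible only for very small $n$.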
