Paper Information - Greedy Minimization of Weakly Supermodular Set Functions

Greedy Minimization of Weakly Supermodular Set Functions

This paper defines weak-$\alpha$-supermodularity for set functions. Many optimization objectives in machine learning and data mining seek to minimize such functions under cardinality constraints. We prove that such problems benefit from a greedy extension phase. Explicitly, let $S^*$ be the optimal set of cardinality $k$ that minimizes $f$ and let $S_0$ be an initial solution such that $f(S_0)/f(S^*) \le \rho$. Then, a greedy extension $S \supset S_0$ of size $|S| \le |S_0| + \lceil \alpha k \ln(\rho/\varepsilon) \rceil$ yields $f(S)/f(S^*) \le 1+\varepsilon$. As example applications of this framework, we give new bicriteria results for $k$-means, sparse regression, and column subset selection.
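The greedy extension phase described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names `extension_budget` and `greedy_extend`, the toy 1-D data, and the choice to restrict $k$-means centers to data points are all assumptions made for illustration. The sketch also does not compute $\alpha$; the guarantee applies only when $f$ is weakly $\alpha$-supermodular for the value of $\alpha$ passed in.

```python
import math


def extension_budget(alpha, k, rho, eps):
    """Number of greedy additions from the abstract: ceil(alpha * k * ln(rho / eps))."""
    return math.ceil(alpha * k * math.log(rho / eps))


def greedy_extend(f, ground_set, initial, steps):
    """Grow `initial` by up to `steps` elements, each time adding the
    element whose inclusion gives the smallest value of f."""
    chosen = set(initial)
    for _ in range(steps):
        candidates = [e for e in ground_set if e not in chosen]
        if not candidates:
            break  # nothing left to add
        best = min(candidates, key=lambda e: f(chosen | {e}))
        chosen.add(best)
    return chosen


# Toy objective (an assumption for illustration): 1-D k-means cost
# with centers restricted to the data points themselves.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]


def kmeans_cost(centers):
    return sum(min((p - c) ** 2 for c in centers) for p in points)


start = {0.0}                      # a poor initial solution S_0
extended = greedy_extend(kmeans_cost, points, start, steps=1)
print(kmeans_cost(start), kmeans_cost(extended))  # 370.0 7.0
print(extension_budget(alpha=1, k=2, rho=10, eps=0.1))  # 10
```

In this toy run a single greedy addition picks the point 11.0 and drops the cost from 370 to 7. In practice the number of steps would be set from `extension_budget`, using the approximation ratio $\rho$ of whatever method produced $S_0$.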
