Dependent randomized rounding for clustering and partition systems with knapsack constraints

Clustering problems are fundamental to unsupervised learning. With the increasing emphasis on fairness in machine learning and AI, one representative notion of fairness is that no single demographic group should be over-represented among the cluster centers. This requirement, along with much more general clustering problems, can be formulated using "knapsack" and "partition" constraints. We develop new randomized algorithms targeting such problems and study two in particular: multi-knapsack median and multi-knapsack center. Our rounding algorithms yield new approximation and pseudo-approximation algorithms for these problems. One key technical tool, which may be of independent interest, is a new tail bound, analogous to that of Feige (2006), for sums of random variables with unbounded variances. Such bounds are very useful for inferring properties of large networks from few samples.
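To make the term "dependent randomized rounding" concrete, the following is a minimal sketch of the classical pairwise scheme for a single cardinality constraint, in the spirit of the dependent rounding of Gandhi et al. It is not the algorithm of this paper, which handles multiple knapsack and partition constraints. The function name `dependent_round` and the tolerance `eps` are illustrative assumptions, and the input is assumed to have an integral sum.

```python
import random


def dependent_round(x, rng=None, eps=1e-9):
    """Round a fractional vector x in [0,1]^n to a 0/1 vector.

    Illustrative sketch (hypothetical helper, not the paper's method).
    Assumes sum(x) is an integer. The output keeps the same sum, and
    each coordinate satisfies E[X_i] = x_i.
    """
    rng = rng or random.Random()
    x = list(x)
    # Indices whose values are still strictly fractional.
    frac = [i for i, v in enumerate(x) if eps < v < 1 - eps]
    while len(frac) >= 2:
        i, j = frac[0], frac[1]
        a = min(1 - x[i], x[j])  # largest shift of mass from j to i
        b = min(x[i], 1 - x[j])  # largest shift of mass from i to j
        # The probabilities are chosen so the expected change in each
        # coordinate is zero: a * b/(a+b) - b * a/(a+b) = 0.
        if rng.random() < b / (a + b):
            x[i] += a
            x[j] -= a
        else:
            x[i] -= b
            x[j] += b
        # At least one of x[i], x[j] is now integral.
        frac = [k for k in frac if eps < x[k] < 1 - eps]
    return [int(round(v)) for v in x]
```

For example, `dependent_round([0.5, 0.5, 0.25, 0.75])` always returns a 0/1 vector with exactly two ones, whereas independent rounding would only match that count in expectation.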

[1] Dana Ron, et al. Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection, 2016, ICALP.

[2] Desh Ranjan, et al. Balls and bins: A study in negative dependence, 1996, Random Struct. Algorithms.

[3] József Beck, et al. "Integer-making" theorems, 1981, Discret. Appl. Math.

[4] Brian Garnett. Small deviations of sums of independent random variables, 2020, J. Comb. Theory, Ser. A.

[5] Jan Vondrák, et al. Dependent Randomized Rounding via Exchange Properties of Combinatorial Structures, 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[6] Aravind Srinivasan, et al. An extension of the Lovász local lemma, and its applications to integer programming, 1996, SODA '96.

[7] Shi Li, et al. Constant approximation for k-median and k-means with outliers via iterative rounding, 2017, STOC.

[8] Maxim Sviridenko, et al. Pipage Rounding: A New Method of Constructing Algorithms with Proven Performance Guarantee, 2004, J. Comb. Optim.

[9] Dana Ron, et al. Approximating average parameters of graphs, 2008, Random Struct. Algorithms.

[10] Amin Saberi, et al. A new greedy approach for facility location problems, 2002, STOC '02.

[11] Uriel Feige, et al. On sums of independent random variables with unbounded variance, and estimating the average degree in a graph, 2004, STOC '04.

[12] Aravind Srinivasan, et al. An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization, 2014, SODA.

[13] K. Joag-dev, et al. Negative Association of Random Variables with Applications, 1983.

[14] Dana Ron, et al. Sublinear Time Estimation of Degree Distribution Moments: The Arboricity Connection, 2016, SIAM J. Discret. Math.

[15] Jan Vondrák, et al. Maximizing a Monotone Submodular Function Subject to a Matroid Constraint, 2011, SIAM J. Comput.

[16] Aravind Srinivasan, et al. Chernoff-Hoeffding bounds for applications with limited independence, 1995, SODA '93.

[17] Shi Li, et al. Approximating k-median via pseudo-approximation, 2012, STOC '13.

[18] Aravind Srinivasan, et al. Fault-Tolerant Facility Location: A Randomized Dependent LP-Rounding Algorithm, 2010, IPCO.

[19] Richard M. Karp, et al. Global wire routing in two-dimensional arrays, 1987, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).

[20] Jiawei Zhang, et al. Bounding Probability of Small Deviation: A Fourth Moment Approach, 2010, Math. Oper. Res.

[21] S. M. Samuels. On a Chebyshev-Type Inequality for Sums of Independent Random Variables, 1966.

[22] Fabián A. Chudak, et al. Improved Approximation Algorithms for the Uncapacitated Facility Location Problem, 2003, SIAM J. Comput.

[23] Nikhil Bansal, et al. Approximation-Friendly Discrepancy Rounding, 2015, IPCO.

[24] Aravind Srinivasan, et al. Approximation algorithms for stochastic clustering, 2018, NeurIPS.

[25] David B. Shmoys, et al. A unified approach to approximation algorithms for bottleneck problems, 1986, JACM.

[26] Nikhil Bansal, et al. On a generalization of iterated and randomized rounding, 2018, STOC.

[27] Amit Kumar, et al. The matroid median problem, 2011, SODA '11.

[28] Aravind Srinivasan, et al. Distributions on level-sets with applications to approximation algorithms, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[29] Samir Khuller, et al. Fault tolerant K-center problems, 2000, Theor. Comput. Sci.

[30] Aravind Srinivasan, et al. New algorithmic aspects of the Local Lemma with applications to routing and partitioning, 1999, SODA '99.

[31] David P. Williamson, et al. The Design of Approximation Algorithms, 2011.

[32] Samir Khuller, et al. Fault tolerant K-center problems, 1997, Theor. Comput. Sci.

[33] Shi Li, et al. A Dependent LP-Rounding Approach for the k-Median Problem, 2012, ICALP.

[34] Jian Li, et al. Matroid and Knapsack Center Problems, 2013, Algorithmica.

[35] Rajiv Gandhi, et al. Dependent rounding and its applications to approximation algorithms, 2006, JACM.

[36] Noga Alon, et al. Nonnegative k-sums, fractional covers, and probability of small deviations, 2012, J. Comb. Theory, Ser. B.

[37] Jan Vondrák, et al. Multi-budgeted matchings and matroid intersection via dependent rounding, 2011, SODA '11.

[38] Noga Alon, et al. The Probabilistic Method, 2015, Fundamentals of Ramsey Theory.