Parameterized Approximation Schemes for Clustering with General Norm Objectives

This paper considers the well-studied algorithmic regime of designing a $(1+\epsilon)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,\epsilon)\cdot\mathrm{poly}(n)$ (sometimes called an efficient parameterized approximation scheme, or EPAS for short). Notable results of this kind include EPASes in the high-dimensional Euclidean setting for $k$-center [Bădoiu, Har-Peled, Indyk; STOC'02] as well as for $k$-median and $k$-means [Kumar, Sabharwal, Sen; J. ACM 2010]. However, existing EPASes handle only basic objectives (such as $k$-center, $k$-median, and $k$-means) and are tailored to the specific objective and metric space. Our main contribution is a clean and simple EPAS that settles more than ten clustering problems (across multiple well-studied objectives as well as metric spaces) and unifies well-known EPASes. Our algorithm gives EPASes for a large variety of clustering objectives (for example, $k$-means, $k$-center, $k$-median, priority $k$-center, $\ell$-centrum, ordered $k$-median, socially fair $k$-median, also known as robust $k$-median, or more generally monotone norm $k$-clustering) and metric spaces (for example, continuous high-dimensional Euclidean spaces, metrics of bounded doubling dimension, bounded treewidth metrics, and planar metrics). Key to our approach is a new concept that we call bounded $\epsilon$-scatter dimension: an intrinsic complexity measure of a metric space that is a relaxation of the standard notion of bounded doubling dimension. Our main technical result shows that two conditions are essentially sufficient for our algorithm to yield an EPAS on the input metric $M$ for any clustering objective: (i) the objective is described by a monotone (not necessarily symmetric!) norm, and (ii) the $\epsilon$-scatter dimension of $M$ is upper bounded by a function of $\epsilon$.
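To make the phrase "monotone norm $k$-clustering" concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It only evaluates several of the objectives named above as functions of one distance vector (each point's distance to its nearest center). The function names, the Euclidean metric, and the unnormalized form of the socially fair objective are assumptions made for illustration; $k$-means appears as the squared $\ell_2$ norm.

```python
import math

def distance_vector(points, centers):
    """Distance from each point to its nearest center (Euclidean, assumed)."""
    return [min(math.dist(p, c) for c in centers) for p in points]

def top_ell_norm(v, ell):
    """Sum of the ell largest entries: the ell-centrum objective."""
    return sum(sorted(v, reverse=True)[:ell])

def ordered_norm(v, weights):
    """Ordered weighted norm: non-increasing, non-negative weights
    applied to the entries of v sorted in decreasing order."""
    return sum(w * x for w, x in zip(weights, sorted(v, reverse=True)))

def group_max_norm(v, groups):
    """Maximum over groups of the summed entries (socially fair k-median,
    unnormalized; normalization conventions vary). Monotone, not symmetric."""
    return max(sum(v[i] for i in g) for g in groups)

def objectives(points, centers, ell, weights, groups):
    """Evaluate several clustering objectives on the same distance vector."""
    d = distance_vector(points, centers)
    return {
        "k_center": max(d),                    # l_infinity norm
        "k_median": sum(d),                    # l_1 norm
        "k_means": sum(x * x for x in d),      # squared l_2 norm
        "ell_centrum": top_ell_norm(d, ell),
        "ordered_k_median": ordered_norm(d, weights),
        "socially_fair_k_median": group_max_norm(d, groups),
    }

if __name__ == "__main__":
    pts = [(0, 0), (3, 4), (6, 8), (0, 1)]
    print(objectives(pts, [(0, 0)], ell=2,
                     weights=[1, 0.5, 0.5, 0],
                     groups=[[0, 1], [2, 3]]))
```

Every objective here is non-decreasing in each coordinate of the distance vector, which is the monotonicity required by condition (i); the group-based objective shows why symmetry is not assumed, since permuting points across groups can change its value.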
