Connected k-Center and k-Diameter Clustering

Motivated by an application from geodesy, we introduce a novel clustering problem that is a $k$-center (or $k$-diameter) problem with a side constraint. For the side constraint, we are given an undirected connectivity graph $G$ on the input points, and a clustering is feasible only if every cluster induces a connected subgraph of $G$. We call the resulting problems the connected $k$-center problem and the connected $k$-diameter problem. We prove several results on the complexity and approximability of these problems. Our main result is an $O(\log^2{k})$-approximation algorithm for the connected $k$-center and the connected $k$-diameter problem. For Euclidean metrics and metrics with constant doubling dimension, the approximation factor of this algorithm improves to $O(1)$. We also consider the special cases in which the connectivity graph is a line or a tree. For the line, we give optimal polynomial-time algorithms. For the tree, we give either an optimal polynomial-time algorithm or a $2$-approximation algorithm for each variant of our model. We complement our upper bounds with several lower bounds.
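
To make the side constraint concrete, the following minimal Python sketch checks whether a given clustering is feasible and evaluates its $k$-diameter and $k$-center cost. It only illustrates the problem definition and is not one of the algorithms from this work. The function names, the adjacency-dictionary representation of $G$, the Euclidean metric, and the restriction of each center to the points of its own cluster are all assumptions made for the example.

```python
from collections import deque
from itertools import combinations
from math import dist


def is_connected(cluster, adj):
    """Return True if `cluster` induces a connected subgraph of G (BFS inside the cluster)."""
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def is_feasible(clusters, adj, n):
    """Clusters must partition points 0..n-1 and each must be connected in G."""
    covered = sorted(p for c in clusters for p in c)
    return covered == list(range(n)) and all(is_connected(c, adj) for c in clusters)


def diameter_cost(clusters, points):
    """k-diameter objective: largest pairwise distance within any cluster."""
    return max(
        max((dist(points[a], points[b]) for a, b in combinations(c, 2)), default=0.0)
        for c in clusters
    )


def center_cost(clusters, points):
    """k-center objective, assuming each center is a point of its own cluster."""
    return max(
        min(max(dist(points[ctr], points[p]) for p in c) for ctr in c)
        for c in clusters
    )


# Hypothetical instance: two tight pairs in the plane, but G is the path 0-2-1-3.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
adj = {0: [2], 2: [0, 1], 1: [2, 3], 3: [1]}

metric_best = [[0, 1], [2, 3]]   # best without the constraint, but disconnected in G
connected = [[0, 2], [1, 3]]     # feasible under the connectivity constraint
```

In this hypothetical instance, the clustering that is best for the metric alone is infeasible because neither pair is adjacent in $G$, so a feasible solution has to accept a much larger radius and diameter.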
