Clustering under Perturbation Resilience

Motivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [Proceedings of the Symposium on Innovations in Computer Science, 2010] proposed analyzing objective-based clustering problems under the assumption that the optimal clustering for the objective is preserved under small multiplicative perturbations to distances between points. The hope is that by exploiting the structure in such instances, one can overcome worst-case hardness results. In this paper, we provide several results within this framework. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to perturbations of a factor of $1 + \sqrt{2}$, solving an open problem of Awasthi, Blum, and Sheffet [Proceedings of the IEEE Annual Symposium on Foundations of Computer Science, 2010]. For $k$-median, a center-based objective of special interest, we additionally give algorithms under a more relaxed assumption in which ...