Approximating $(k,\ell)$-center clustering for curves

The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $\mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $\mathcal{C}$ of $k$ centers (not necessarily part of $\mathcal{G}$) such that the maximum distance between a point in $\mathcal{G}$ and its nearest neighbor in $\mathcal{C}$ is minimized. In this paper we study the corresponding $(k,\ell)$-center problem for polygonal curves under the Fréchet distance: given a set $\mathcal{G}$ of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity $m$, determine a set $\mathcal{C}$ of $k$ polygonal curves in $\mathbb{R}^d$, each of complexity $\ell$, such that the maximum Fréchet distance of a curve in $\mathcal{G}$ to its closest curve in $\mathcal{C}$ is minimized. We substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $\ell$ is part of the input, then there is no polynomial-time approximation scheme unless $\mathsf{P}=\mathsf{NP}$. Our constructions yield different bounds for one- and two-dimensional curves and for the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $\mathsf{NP}$-hardness extends to the case that $\ell=\infty$, i.e., to the problem of computing the minimum enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.
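To make the objective concrete, the following Python sketch evaluates the discrete Fréchet distance with the standard dynamic program and runs a plain farthest-first traversal in the style of Gonzalez over the input curves. It is an illustration under assumptions, not the adaptation described above: the function names `discrete_frechet` and `farthest_first_centers` are hypothetical, the centers are chosen among the input curves, and the simplification step that would reduce each center to complexity $\ell$ is omitted entirely.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two point sequences (standard DP)."""
    n, m = len(p), len(q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                # Best way to reach (i, j): advance on p, on q, or on both.
                best_prev = min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
                table[i][j] = max(best_prev, d)
    return table[-1][-1]


def farthest_first_centers(curves, k):
    """Farthest-first traversal over the input curves.

    Returns the indices of the chosen curves and the resulting radius.
    Assumption: centers are input curves; no simplification to complexity l.
    """
    centers = [0]  # arbitrary first center
    nearest = [discrete_frechet(c, curves[0]) for c in curves]
    while len(centers) < min(k, len(curves)):
        # Pick the curve that is currently farthest from all chosen centers.
        far = max(range(len(curves)), key=nearest.__getitem__)
        centers.append(far)
        for i, c in enumerate(curves):
            nearest[i] = min(nearest[i], discrete_frechet(c, curves[far]))
    return centers, max(nearest)


if __name__ == "__main__":
    curves = [
        [(0, 0), (1, 0)],
        [(0, 0.1), (1, 0.1)],
        [(0, 5), (1, 5)],
        [(0, 5.2), (1, 5.2)],
    ]
    print(farthest_first_centers(curves, 2))
```

On the four toy curves in the usage example, the traversal picks one curve from each of the two well-separated groups, so the radius is the small within-group distance rather than the large between-group one.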
