Algorithms for dynamic geometric problems over data streams

Computing over data streams is of growing interest in many areas of computer science, including databases, computer networks, and the theory of algorithms. In this setting, the algorithm sees the elements of the input one by one, in arbitrary order, and must compute a certain function of the input. However, it does not have enough memory to store the whole input, so it must instead maintain a “sketch” of the data. Designing a sketching method for a given problem is a novel and exciting challenge in algorithm design. Initial research on streaming algorithms focused on computing simple numerical statistics of the input, such as the median [23], the number of distinct elements [11], or frequency moments [1]. More recently, researchers showed that those algorithms can be used as subroutines to solve more complex problems (see, e.g., [13]); the survey [24] gives a detailed description of past and recent developments. Still, the scope of algorithmic problems for which stream algorithms exist is not well understood. It is therefore important to identify new classes of problems that can be solved in this restricted setting.

In this paper we investigate stream algorithms for dynamic geometric problems. Specifically, we present low-storage data structures that maintain approximate solutions to geometric problems under insertions and deletions of points (this is called the turnstile model in [24]). From the data stream perspective, the stream consists of m operations, each of which is either Add(p) (which adds p to the current
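As a toy illustration of the turnstile setting described above, the following sketch processes a stream of point insertions and deletions while storing only O(d) numbers, independent of the stream length. It is not one of the data structures presented in the paper: it maintains an exact, very simple statistic (the centroid of the current point set) purely to show what "maintaining a small summary under insertions and deletions" can look like. The class name `CentroidSketch` and the method names `add` and `remove` are hypothetical; the text above names only the Add(p) operation, and the name used here for the deletion operation is an assumption.

```python
from typing import List, Optional, Sequence


class CentroidSketch:
    """Fixed-size summary of a point set under insertions and deletions.

    Illustrative assumption, not the paper's method: we keep only the
    number of points and the per-coordinate sums, i.e. d + 1 numbers.
    """

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.count = 0                # number of points currently in the set
        self.sums = [0.0] * dim       # coordinate-wise sum of those points

    def _update(self, p: Sequence[float], sign: int) -> None:
        if len(p) != self.dim:
            raise ValueError("point has the wrong dimension")
        # The summary is linear in the input, so a deletion is just
        # an insertion with the opposite sign.
        self.count += sign
        for i, x in enumerate(p):
            self.sums[i] += sign * x

    def add(self, p: Sequence[float]) -> None:
        """Process an insertion of point p."""
        self._update(p, +1)

    def remove(self, p: Sequence[float]) -> None:
        """Process a deletion of point p (assumed to be currently present)."""
        self._update(p, -1)

    def centroid(self) -> Optional[List[float]]:
        """Return the centroid of the current set, or None if it is empty."""
        if self.count <= 0:
            return None
        return [s / self.count for s in self.sums]


if __name__ == "__main__":
    sketch = CentroidSketch(dim=2)
    sketch.add((0, 0))
    sketch.add((4, 2))
    sketch.add((10, 10))
    sketch.remove((10, 10))
    print(sketch.centroid())  # [2.0, 1.0]
```

The point of the example is the linearity of the summary: because each update only adds or subtracts a contribution, deletions are handled as easily as insertions, and no individual point is ever stored. The geometric problems considered in the paper call for far more elaborate summaries and only approximate answers.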

[1] Philippe Flajolet, et al. Probabilistic Counting Algorithms for Data Base Applications, 1985, J. Comput. Syst. Sci.

[2] Noga Alon, et al. The space complexity of approximating the frequency moments, 1996, STOC '96.

[3] Rafail Ostrovsky, et al. Efficient search for approximate nearest neighbor in high dimensional spaces, 1998, STOC '98.

[4] Wei Hong, et al. The sensor spectrum: technology, trends, and requirements, 2003, SGMD.

[5] Piotr Indyk, et al. Comparing Data Streams Using Hamming Norms (How to Zero In), 2002, VLDB.

[6] Rina Panigrahy, et al. Better streaming algorithms for clustering problems, 2003, STOC '03.

[7] Sudipto Guha, et al. Fast, small-space algorithms for approximate histogram maintenance, 2002, STOC '02.

[8] Moses Charikar, et al. Similarity estimation techniques from rounding algorithms, 2002, STOC '02.

[9] S. Muthukrishnan, et al. Data streams: algorithms and applications, 2005, SODA '03.

[10] Sudipto Guha, et al. Clustering Data Streams, 2000, FOCS.

[11] Adam Meyerson, et al. Online facility location, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[12] Yair Bartal, et al. Probabilistic approximation of metric spaces and its algorithmic applications, 1996, Proceedings of 37th Conference on Foundations of Computer Science.

[13] Kamesh Munagala, et al. Local search heuristic for k-median and facility location problems, 2001, STOC '01.

[14] Laurence A. Wolsey, et al. An analysis of the greedy algorithm for the submodular set covering problem, 1982, Comb.

[15] J. Ian Munro, et al. Selection and sorting with limited storage, 1978, 19th Annual Symposium on Foundations of Computer Science (sfcs 1978).

[16] Sudipto Guha, et al. Approximating a finite metric by a small number of tree metrics, 1998, Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280).

[17] Sariel Har-Peled, et al. On coresets for k-means and k-median clustering, 2004, STOC '04.

[18] Artur Czumaj, et al. Estimating the weight of metric minimum spanning trees in sublinear-time, 2004, STOC '04.

[19] Bernard Chazelle, et al. Approximating the Minimum Spanning Tree Weight in Sublinear Time, 2001, ICALP.

[20] Kamesh Munagala, et al. Local Search Heuristics for k-Median and Facility Location Problems, 2004, SIAM J. Comput.