Succinct Approximate Rank Queries

We consider the problem of summarizing a multiset of elements in $\{1, 2, \ldots, n\}$ under the constraint that no element appears more than $\ell$ times. The goal is to answer \emph{rank} queries (given $i\in\{1, 2, \ldots, n\}$, how many elements of the multiset are smaller than $i$?) in constant time and with an additive error of at most $\Delta$. For this problem, we prove a lower bound of $\mathcal B_{\ell,n,\Delta}\triangleq \left\lfloor{\frac{n}{\left\lceil{\Delta / \ell}\right\rceil}}\right\rfloor \log\big({\max\{\left\lfloor{\ell / \Delta}\right\rfloor,1\} + 1}\big)$ bits and provide a \emph{succinct} construction that uses $\mathcal B_{\ell,n,\Delta}(1+o(1))$ bits. Next, we generalize our data structure to process a stream of integers in $\{0,1,\ldots,\ell\}$: upon a query for some $i\le n$, it returns a $\Delta$-additive approximation of the sum of the \emph{last} $i$ elements. We show that this, too, can be done in constant time using $\mathcal B_{\ell,n,\Delta}(1+o(1))$ bits. This yields the first sublinear-space algorithm that computes approximate sliding-window sums in $O(1)$ time when the window size is given at query time; additionally, it requires only a $(1+o(1))$ factor more space than is needed for a fixed window size.
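To make the bound concrete, the following minimal sketch evaluates $\mathcal B_{\ell,n,\Delta}$ and pairs it with a naive sampling baseline for approximate rank. It rests on assumptions that are not stated in the abstract: the logarithm is taken base 2, the parameters $\ell$, $n$, $\Delta$ are positive integers, and the names `lower_bound_bits` and `SampledRank` are hypothetical. The baseline is not the succinct construction; it stores full prefix counts and therefore uses more than $\mathcal B_{\ell,n,\Delta}(1+o(1))$ bits. It only illustrates why sampling every $\lceil \Delta/\ell \rceil$-th position keeps the additive error below $\Delta$.

```python
import math
from itertools import accumulate


def lower_bound_bits(ell: int, n: int, delta: int) -> float:
    """Evaluate B_{ell,n,Delta}; assumes positive integers and log base 2."""
    block = -(-delta // ell)            # ceil(delta / ell)
    symbols = max(ell // delta, 1) + 1  # max{floor(ell / delta), 1} + 1
    return (n // block) * math.log2(symbols)


class SampledRank:
    """Naive baseline (not the succinct construction): exact prefix counts
    stored at every `step`-th position, where step = ceil(delta / ell)."""

    def __init__(self, counts: list[int], ell: int, delta: int) -> None:
        # counts[j - 1] is the multiplicity of element j, each at most ell.
        if any(c < 0 or c > ell for c in counts):
            raise ValueError("multiplicity outside [0, ell]")
        self.step = -(-delta // ell)
        prefix = [0, *accumulate(counts)]   # prefix[m] = sum of counts[:m]
        self.samples = prefix[::self.step]  # samples[t] = prefix[t * step]

    def rank(self, i: int) -> int:
        """Approximate number of elements smaller than i, for 1 <= i <= n."""
        # At most step - 1 positions are skipped, each holding <= ell
        # elements, so the undercount is at most (step - 1) * ell < delta.
        return self.samples[(i - 1) // self.step]
```

For example, with $\ell = 1$, $n = 100$, and $\Delta = 10$, the block length is $10$ and each block contributes $\log_2 2 = 1$ bit, so `lower_bound_bits(1, 100, 10)` evaluates to `10.0`.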
