Edit Distance to Monotonicity in Sliding Windows

Given a stream of items, each associated with a numerical value, its edit distance to monotonicity is the minimum number of items that must be removed so that the remaining items are non-decreasing with respect to the numerical value. The space complexity of estimating the edit distance to monotonicity of a data stream has become well understood over the past few years. Motivated by applications in network quality monitoring, we extend the study to estimating the edit distance to monotonicity of a sliding window covering the $w$ most recent items in the stream, for any $w \geq 1$. We give a deterministic algorithm that returns an estimate within a factor of $(4+\epsilon)$ using $O(\frac{1}{\epsilon^2} \log^2(\epsilon w))$ space. We also extend the study in two directions. First, we consider a stream where each item is associated with a value from a partially ordered set. We give a randomized $(4+\epsilon)$-approximate algorithm using $O(\frac{1}{\epsilon^2} \log(\epsilon^2 w) \log w)$ space. Second, we consider an out-of-order stream where each item is associated with a creation time and a numerical value, and items may arrive out of order with respect to their creation times. The goal is to estimate the edit distance to monotonicity with respect to the numerical values of the items when arranged in order of creation time. We show that any randomized constant-approximate algorithm requires linear space.
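To make the quantity being estimated concrete, the following is a minimal sketch of an exact, offline baseline. It is not the small-space algorithm of this paper: it stores the whole window and recomputes from scratch at every step. It relies on the standard fact that the edit distance to monotonicity equals the number of items minus the length of a longest non-decreasing subsequence. The function names are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import deque


def edit_distance_to_monotonicity(values):
    """Exact distance: item count minus longest non-decreasing subsequence length."""
    # tails[k] is the smallest possible last value of a non-decreasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for v in values:
        # bisect_right lets equal values extend a subsequence (non-decreasing).
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return len(values) - len(tails)


def sliding_window_distances(stream, w):
    """Yield the exact distance of the window of the w most recent items.

    Illustrative baseline only: uses O(w) space and O(w log w) time per item.
    """
    window = deque(maxlen=w)
    for v in stream:
        window.append(v)
        yield edit_distance_to_monotonicity(window)


def distance_by_creation_time(items):
    """Out-of-order setting, computed offline.

    items are (creation_time, value) pairs in arrival order; the distance is
    measured on the values after arranging the items by creation time.
    """
    ordered = sorted(items, key=lambda item: item[0])
    return edit_distance_to_monotonicity([value for _, value in ordered])


if __name__ == "__main__":
    print(edit_distance_to_monotonicity([1, 3, 2, 2, 5, 4]))
    print(list(sliding_window_distances([5, 4, 3, 6, 7], w=3)))
    print(distance_by_creation_time([(2, 10), (1, 20), (3, 15)]))
```

For the sequence 1, 3, 2, 2, 5, 4, a longest non-decreasing subsequence is 1, 2, 2, 5, so two items must be removed. The baseline needs space linear in $w$, which is the cost the approximate algorithms described above are designed to avoid.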
