Space lower bounds for online pattern matching

We present space lower bounds for online pattern matching under a number of different distance measures. Given a pattern of length m and a text that arrives one character at a time, the online pattern matching problem is to report the distance between the pattern and the current length-m sliding window of the text as soon as each new character arrives. We require that the correct answer be given at each position with constant probability. We give Ω(m) bit space lower bounds for the L_1, L_2, L_∞, Hamming, edit and swap distances, as well as for any algorithm that computes the cross-correlation/convolution. We then show a dichotomy between distance functions that have wildcard-like properties and those that do not. In the former case, which includes, for example, pattern matching with character classes, we give Ω(m) bit space lower bounds. For the other distance functions, we show a space lower bound of Ω(log m) bits and an upper bound of O(log^2 m) bits. Finally, we discuss space lower bounds for non-binary inputs and show how in some cases they can be improved.
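To make the problem statement concrete, here is a minimal sketch of the online setting for Hamming distance. It is a naive baseline, not a method from the paper: it simply stores the last m characters of the text, so its space grows linearly in m, which is the kind of storage the Ω(m) bounds say cannot be avoided in general. The function name online_hamming and the convention of reporting None before the first full window are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def online_hamming(pattern: str, text: Iterable[str]) -> Iterator[Optional[int]]:
    """Yield, after each arriving text character, the Hamming distance
    between the pattern and the most recent len(pattern) characters.

    Hypothetical helper for illustration: yields None until a full
    window of the text has been seen.
    """
    m = len(pattern)
    # The window keeps only the last m characters; older ones are dropped.
    window: deque = deque(maxlen=m)
    for ch in text:
        window.append(ch)
        if len(window) < m:
            yield None  # not enough text yet to align the pattern
        else:
            # Count mismatching positions between pattern and window.
            yield sum(p != t for p, t in zip(pattern, window))


if __name__ == "__main__":
    for i, d in enumerate(online_hamming("aba", "abababb")):
        print(i, d)
```

Each output is produced before the next character is read, which is the online requirement described above. Recomputing the distance from scratch costs O(m) time per character; the sketch is meant to illustrate the space usage and the reporting model only.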
