Parameterized Matching in the Streaming Model

We study parameterized matching in the streaming model: given a pattern of length m, we must report whether the pattern matches the last m symbols of the stream before the next symbol arrives. Parameterized matching is a natural generalisation of exact matching in which an arbitrary one-to-one relabelling of the pattern symbols is allowed. We show that this problem can be solved in constant time per arriving stream symbol and in sublinear, near-optimal space, with high probability. Our results are surprising and important: it has been shown that almost no streaming pattern matching problem can be solved, even with randomisation, in less than Θ(m) space, and exact matching was the only problem known to admit a sublinear, near-optimal space solution. Here we demonstrate that a similar sublinear, near-optimal space solution is achievable for an even more challenging problem. The proof is considerably more complex than that for exact matching.
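To make the problem statement concrete, the following is a minimal sketch of the definition only. It is a naive baseline, not the algorithm of this paper: it stores the whole window, so it uses Θ(m) space and O(m) time per arriving symbol, which is exactly the cost the paper's result avoids. The function names `p_match` and `stream_matches` are hypothetical and chosen for illustration.

```python
from collections import deque


def p_match(pattern, window):
    """Return True if a one-to-one relabelling of pattern symbols gives window."""
    if len(pattern) != len(window):
        return False
    forward = {}   # pattern symbol -> window symbol
    backward = {}  # window symbol -> pattern symbol
    for p, w in zip(pattern, window):
        # setdefault records the pairing on first sight and returns the
        # earlier pairing otherwise; both directions must stay consistent
        # for the relabelling to be one-to-one.
        if forward.setdefault(p, w) != w or backward.setdefault(w, p) != p:
            return False
    return True


def stream_matches(pattern, stream):
    """Yield each stream index at which the last m symbols p-match the pattern.

    Naive baseline (assumption for illustration): keeps all m symbols.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # the last m symbols seen so far
    for i, symbol in enumerate(stream):
        window.append(symbol)
        if len(window) == m and p_match(pattern, window):
            yield i
```

For example, `p_match("abab", "xyxy")` holds under the relabelling a to x, b to y, whereas `p_match("ab", "xx")` does not, because mapping both a and b to x is not one-to-one.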
