Finding witnesses by peeling

In the k-matches problem, we are given a pattern and a text, and for each text location the desired output consists of all aligned matching characters if there are k or fewer of them, and any k aligned matching characters if there are more than k of them. This problem is one of several string matching problems that seek not only to find where the pattern matches the text under different “match” definitions, but also to provide witnesses to the match. Other such problems include k-aligned ones, k-witnesses, and k-mismatches. In addition, the solutions to several other string matching problems rely on efficient solutions to these witness-finding problems. In this article we provide a general method for solving such witness-finding problems efficiently. We do so by casting the problem as a generalization of group testing, which we then solve by a process we call peeling. Using this general framework we obtain improved results for all of the problems mentioned. We also show that our method solves a couple of problems outside the pattern matching domain.
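To make the output format of the k-matches problem concrete, here is a minimal brute-force sketch. It is not the peeling method of the article, only a quadratic-time reference for the definition. The function name `k_matches`, the use of plain character equality as the "match" relation, and the choice to report the first k witnesses (the definition allows any k) are assumptions made for illustration.

```python
def k_matches(pattern, text, k):
    """Brute-force reference for the k-matches output (illustrative only).

    For each alignment i of the pattern against the text, return the pattern
    positions j with text[i + j] == pattern[j]: all of them if there are at
    most k, otherwise some k of them (here, the first k found).
    Assumption: "match" means plain character equality.
    """
    m, n = len(pattern), len(text)
    result = []
    for i in range(n - m + 1):
        witnesses = []
        for j in range(m):
            if len(witnesses) >= k:
                break  # k witnesses collected; the definition asks for no more
            if text[i + j] == pattern[j]:
                witnesses.append(j)
        result.append(witnesses)
    return result


if __name__ == "__main__":
    # One witness list per alignment of "abc" against "abxabc".
    print(k_matches("abc", "abxabc", 5))
```

This takes time proportional to the product of the pattern and text lengths in the worst case, and serves only as a baseline against which a faster witness-finding method can be checked.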
