Usefulness of the Karp-Miller-Rosenberg Algorithm in Parallel Computations on Strings and Arrays

Abstract

The Karp-Miller-Rosenberg (1972) algorithm was one of the first efficient (almost linear) sequential algorithms for finding repeated patterns and for string matching. In the area of efficient sequential computations on strings it was soon superseded by more efficient (and more sophisticated) algorithms. We show that the Karp-Miller-Rosenberg algorithm (KMR) must nevertheless be regarded as a basic technique in parallel computations. For many problems, variations of KMR give the most efficient parallel algorithms known. The representation of the set of basic factors (subarrays) of a string (array) produced by the algorithm is an extremely useful data structure in parallel algorithms on strings and arrays. It also gives a general unifying framework for a large variety of problems. We show that the following problems for strings and arrays can be solved by almost optimal parallel algorithms: pattern matching, longest repeated factor (subarray), longest common factor (subarray), and maximal symmetric factor (subarray). The following problems for strings can also be solved within the same complexity bounds: finding squares, testing even palstars and compositions of k palindromes for k = 2, 3, 4, computing the Lyndon factorization, and building minimal pattern-matching automata. In the model without concurrent writes the parallel time is O(log² n) (with n processors), and in the model with concurrent writes the time, for most of the problems, is O(log n) (with n processors). For two problems related to the one-dimensional case (longest repeated factor and longest common factor), parallel algorithms using suffix trees have already been designed (Apostolico et al. 1988). However, our data structure is simpler and, furthermore, suffix trees do not work in the two-dimensional case. The complexity of our algorithms does not depend on the size of the alphabet, except for the computation of pattern-matching automata.
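The naming of basic factors mentioned in the abstract can be illustrated with a small sequential sketch. It is not the parallel algorithm of the paper: it only shows the doubling idea, in which the name of a factor of length 2L is derived from the names of its two halves of length L, and it assumes that basic factors are those whose length is a power of two. The function names `kmr_names`, `has_repeat` and `longest_repeated_factor_length` are hypothetical, and renaming by sorting is a simplification chosen for brevity.

```python
def kmr_names(text):
    """Return {length: names} for every power-of-two length <= len(text).

    names[i] is an integer identifying the factor text[i:i+length];
    two positions get the same name iff their factors are equal.
    """
    n = len(text)
    # Round 0: name factors of length 1 by the rank of their symbol.
    rank = {c: r for r, c in enumerate(sorted(set(text)))}
    names = [rank[c] for c in text]
    table = {1: names}
    length = 1
    while 2 * length <= n:
        # A factor of length 2*length is determined by the names of its halves.
        pairs = [(names[i], names[i + length])
                 for i in range(n - 2 * length + 1)]
        rank = {p: r for r, p in enumerate(sorted(set(pairs)))}
        names = [rank[p] for p in pairs]
        length *= 2
        table[length] = names
    return table


def has_repeat(table, n, k):
    """True if some factor of length k (1 <= k <= n) occurs at two positions."""
    length = 1
    while 2 * length <= k:
        length *= 2
    names = table[length]
    seen = set()
    for i in range(n - k + 1):
        # Two overlapping basic factors cover the whole factor of length k.
        key = (names[i], names[i + k - length])
        if key in seen:
            return True
        seen.add(key)
    return False


def longest_repeated_factor_length(text):
    """Length of the longest factor occurring at least twice (overlaps allowed)."""
    n = len(text)
    if n < 2:
        return 0
    table = kmr_names(text)
    lo, hi = 0, n - 1
    # Binary search is valid: a repeat of length k implies one of length k - 1.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_repeat(table, n, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

For example, `longest_repeated_factor_length("abracadabra")` finds the repeated factor "abra" and returns its length. Each doubling round touches every position independently, which is the part a parallel version would distribute over the n processors.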

[1] Wojciech Rytter, et al. Efficient parallel algorithms, 1988.

[2] Maxime Crochemore, et al. An Optimal Algorithm for Computing the Repetitions in a Word, 1981, Inf. Process. Lett.

[3] Donald E. Knuth, et al. Fast Pattern Matching in Strings, 1977, SIAM J. Comput.

[4] Richard J. Lorentz, et al. Linear Time Recognition of Squarefree Strings, 1985.

[5] R.S. Bird, et al. Two Dimensional Pattern Matching, 1977, Inf. Process. Lett.

[6] Maxime Crochemore, et al. Transducers and Repetitions, 1986, Theor. Comput. Sci.

[7] Jean Pierre Duval, et al. Factorizing Words over an Ordered Alphabet, 1983, J. Algorithms.

[8] Gad M. Landau, et al. Parallel Construction of a Suffix Tree (Extended Abstract), 1987, ICALP.

[9] Wojciech Rytter. A Note on Optimal Parallel Transformations of Regular Expressions to Nondeterministic Finite Automata, 1989, Inf. Process. Lett.

[10] Zvi Galil, et al. Open Problems in Stringology, 1985.

[11] Alfred V. Aho, et al. The Design and Analysis of Computer Algorithms, 1974.

[12] Uzi Vishkin, et al. Optimal Parallel Pattern Matching in Strings, 1985, Inf. Control.

[13] Alfred V. Aho, et al. Efficient string matching, 1975, Commun. ACM.

[14] Richard Cole, et al. Parallel merge sort, 1986, 27th Annual Symposium on Foundations of Computer Science (SFCS 1986).

[15] Zvi Galil, et al. A Linear-Time On-Line Recognition Algorithm for "Palstar", 1978, JACM.

[16] Zvi Galil, et al. Time-Space-Optimal String Matching, 1983, J. Comput. Syst. Sci.

[17] Michael G. Main, et al. An O(n log n) Algorithm for Finding All Repetitions in a String, 1984, J. Algorithms.

[18] Maxime Crochemore, et al. String-matching and periods, 1989, Bull. EATCS.

[19] Maxime Crochemore, et al. Longest Common Factor of Two Words, 1987, TAPSOFT, Vol. 1.

[20] Theodore P. Baker. A Technique for Extending Rapid Exact-Match String Matching to Arrays of More Than One Dimension, 1978, SIAM J. Comput.

[21] Arnold L. Rosenberg, et al. Rapid identification of repeated patterns in strings, trees and arrays, 1972, STOC.

[22] Robert S. Boyer, et al. A fast string searching algorithm, 1977, Commun. ACM.

[23] Glenn K. Manacher, et al. A New Linear-Time "On-Line" Algorithm for Finding the Smallest Initial Palindrome of a String, 1975, JACM.

[24] Alberto Apostolico, et al. On Context Constrained Squares and Repetitions in a String, 1984, RAIRO Theor. Informatics Appl.

[25] Zvi Galil. Optimal Parallel Algorithms for String Matching, 1985, Inf. Control.