Dynamic Dictionary Matching in External Memory

In the dynamic dictionary matching problem, a dictionary D contains a set of patterns that can change over time under insertion and deletion of individual patterns. Given an arbitrary text T, we must efficiently list all the dictionary patterns that occur at each text position. We investigate the I/O complexity of this problem for a large dictionary that must be stored on external storage devices. Following a completely new approach, we devise an efficient solution based on the SB-tree data structure (P. Ferragina and R. Grossi, 1995, in "Proc. ACM Symposium on Theory of Computing," pp. 693-702) and a novel notion of certificate for the dictionary matching problem. Our data structure can be adapted to work efficiently in main memory and to solve other problems, thus providing new insight into the nature of the dictionary matching problem. © 1998 Academic Press
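
To make the problem statement concrete, the following is a minimal in-memory sketch of the interface it implies: insert a pattern, delete a pattern, and list the patterns occurring at each text position. It is a naive trie-based baseline written only for illustration, not the SB-tree and certificate approach of the paper, and it makes no attempt to model I/O cost. The class name `NaiveDynamicDictionary` and its method names are hypothetical, and the sketch assumes that "occurring at a position" means starting at that position.

```python
class NaiveDynamicDictionary:
    """Naive in-memory baseline (hypothetical name); NOT the paper's SB-tree."""

    def __init__(self):
        # Each trie node is a dict from character to child node.
        # The key None marks that a pattern ends at this node.
        self.root = {}

    def insert(self, pattern):
        """Add a nonempty pattern to the dictionary."""
        node = self.root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[None] = True

    def delete(self, pattern):
        """Remove a pattern; return False if it was not present."""
        path = []  # (parent node, character) pairs along the pattern
        node = self.root
        for ch in pattern:
            if ch not in node:
                return False
            path.append((node, ch))
            node = node[ch]
        if None not in node:
            return False
        del node[None]
        # Prune branches that no longer lead to any pattern.
        for parent, ch in reversed(path):
            if parent[ch]:
                break
            del parent[ch]
        return True

    def occurrences(self, text):
        """Map each text position to the patterns that start there."""
        result = {}
        for i in range(len(text)):
            node = self.root
            found = []
            j = i
            # Walk the trie along text[i:], reporting every pattern end.
            while j < len(text) and text[j] in node:
                node = node[text[j]]
                j += 1
                if None in node:
                    found.append(text[i:j])
            if found:
                result[i] = found
        return result
```

For example, after inserting `"ab"`, `"abc"`, and `"bc"`, calling `occurrences("abcab")` reports the patterns found at positions 0, 1, and 3; deleting `"abc"` and querying again drops that pattern from the result. Each query in this baseline costs time proportional to the text length times the longest pattern length, which is the kind of cost a dedicated structure for this problem aims to avoid.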
