Pattern Matching in Text Compressed by Using Antidictionaries

This paper addresses compressed pattern matching for text compressed using antidictionaries, a compression scheme recently proposed by Crochemore et al. (1998). We present an algorithm that preprocesses a pattern of length m and an antidictionary M in O(m^2 + ||M||) time, and then scans a compressed text of length n in O(n + r) time to find all pattern occurrences, where ||M|| is the total length of the strings in M and r is the number of pattern occurrences.
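For readers unfamiliar with antidictionaries, the following is a minimal sketch of the underlying compression idea only, not of the pattern matching algorithm this paper presents. It assumes a binary alphabet and an antidictionary given as a list of forbidden words that never occur in the text. Whenever a suffix of the text read so far, followed by one bit, would form a forbidden word, the next bit is forced to be the other bit and can be omitted from the output. The function names (`forced_bit`, `compress`, `decompress`) and the naive suffix check are illustrative assumptions; a practical implementation would use an automaton rather than rescanning the antidictionary at each position.

```python
def forced_bit(prefix, antidictionary):
    """Return the bit forced after `prefix`, or None if both bits are possible.

    A forbidden word w = stem + b forces the bit (1 - b) whenever the
    text read so far ends with `stem`.
    """
    for word in antidictionary:
        stem, last = word[:-1], word[-1]
        if prefix.endswith(stem):
            return "1" if last == "0" else "0"
    return None


def compress(text, antidictionary):
    """Keep only the bits that the antidictionary cannot predict."""
    out = []
    for i, bit in enumerate(text):
        if forced_bit(text[:i], antidictionary) is None:
            out.append(bit)
    return "".join(out)


def decompress(code, length, antidictionary):
    """Rebuild the text; the original length must be supplied separately."""
    text = ""
    remaining = iter(code)
    for _ in range(length):
        forced = forced_bit(text, antidictionary)
        text += forced if forced is not None else next(remaining)
    return text


# Hypothetical example: the text avoids the factors "00" and "111".
antidictionary = ["00", "111"]
text = "0110101101"
code = compress(text, antidictionary)
restored = decompress(code, len(text), antidictionary)
```

In this example only four of the ten bits are unpredictable, so `code` is `"0101"`, and `decompress` recovers the original text from it together with the text length and the antidictionary.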

[1]  Ayumi Shinohara,et al.  An Improved Pattern Matching Algorithm for Strings in Terms of Straight-Line Programs , 1997, CPM.

[2]  Uzi Vishkin,et al.  Matching Patterns in Strings Subject to Multi-Linear Transformations , 1988, Theor. Comput. Sci..

[3]  Wojciech Plandowski,et al.  Efficient Algorithms for Lempel-Ziv Encoding (Extended Abstract) , 1996, SWAT.

[4]  Gary Benson,et al.  Optimal Two-Dimensional Compressed Matching , 1994, J. Algorithms.

[5]  Antonio Restivo,et al.  Minimal Forbidden Words and Factor Automata , 1998, MFCS.

[6]  Ayumi Shinohara,et al.  Shift-And Approach to Pattern Matching in LZW Compressed Text , 1999, CPM.

[7]  Udi Manber,et al.  A text compression scheme that allows fast searching directly in the compressed file , 1994, TOIS.

[8]  Ayumi Shinohara,et al.  Multiple pattern matching in LZW compressed text , 1998, Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225).

[9]  Gad M. Landau,et al.  Efficient pattern matching with scaling , 1990, SODA '90.

[10]  A. Restivo,et al.  Text Compression Using Antidictionaries , 1999, ICALP.

[11]  Masayuki Takeda,et al.  Pattern Matching Machine for Text Compressed Using Finite State Model , 1997 .

[12]  Wojciech Plandowski,et al.  Efficient Algorithms for Lempel-Ziv Encoding , 1996 .

[13]  Gary Benson,et al.  Two-dimensional periodicity and its applications , 1992, SODA '92.

[14]  Alfred V. Aho,et al.  Efficient string matching , 1975, Commun. ACM.

[15]  Gary Benson,et al.  Let sleeping files lie: pattern matching in Z-compressed files , 1994, SODA '94.

[16]  Wojciech Plandowski,et al.  Efficient algorithms for Lempel-Ziv encoding , 1996 .

[17]  Ricardo A. Baeza-Yates,et al.  Fast searching on compressed text allowing errors , 1998, SIGIR '98.

[18]  Mikkel Thorup,et al.  String Matching in Lempel-Ziv Compressed Strings , 1998, Algorithmica.

[19]  Wojciech Rytter,et al.  An Efficient Pattern-Matching Algorithm for Strings with Short Descriptions , 1997, Nord. J. Comput..

[20]  Gary Benson,et al.  Efficient two-dimensional compressed matching , 1992, Data Compression Conference, 1992..

[21]  Ricardo A. Baeza-Yates,et al.  Direct pattern matching on compressed text , 1998, Proceedings. String Processing and Information Retrieval: A South American Symposium (Cat. No.98EX207).

[22]  S. Arikawa,et al.  Byte Pair Encoding: a Text Compression Scheme That Accelerates Pattern Matching , 1999 .