Parameterized matching on non-linear structures

The classical pattern matching paradigm seeks occurrences of one string in another, where both strings are drawn from an alphabet Σ. In the parameterized pattern matching model, a match may involve a consistent renaming of symbols from Σ. Parameterized matching has proven useful in software engineering, computer vision, and other applications. In classical pattern matching, both the text and the pattern are strings. Applications such as searching in XML or in hypertext, however, require searching for strings in non-linear structures such as trees or graphs. The literature contains work on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in O(n log m) time in general, where n is the tree size and m is the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete.
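To make the notion of a consistent renaming concrete, here is a minimal sketch for the plain string case only, not the tree or DAG algorithms of the paper. It assumes every symbol may be renamed and that the renaming is one-to-one, and it uses the standard predecessor encoding, in which each position stores the distance to the previous occurrence of the same symbol (0 if there is none). The function names `prev_encode`, `p_match`, and `p_occurrences` are hypothetical, and the window scan is deliberately naive, taking O(nm) time rather than meeting the bounds stated above.

```python
def prev_encode(s):
    """Predecessor encoding: distance to the previous occurrence of the
    same symbol, or 0 if the symbol has not appeared before."""
    last_seen = {}
    encoded = []
    for i, symbol in enumerate(s):
        encoded.append(i - last_seen[symbol] if symbol in last_seen else 0)
        last_seen[symbol] = i
    return encoded


def p_match(a, b):
    """True if a can be turned into b by a one-to-one renaming of symbols
    (assumption: all symbols are renameable parameters)."""
    return len(a) == len(b) and prev_encode(a) == prev_encode(b)


def p_occurrences(text, pattern):
    """Naive scan: start positions where a window of text parameterized-
    matches the pattern. Re-encodes every window, so it is O(n * m)."""
    m = len(pattern)
    target = prev_encode(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if prev_encode(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    print(p_match("abab", "xyxy"))
    print(p_match("abab", "xyyx"))
    print(p_occurrences("abacbcb", "xyx"))
```

Two strings of equal length match under a one-to-one renaming exactly when their predecessor encodings agree, which is why the comparison reduces to list equality.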
