Faster bit-parallel algorithms for unordered pseudo-tree matching and tree homeomorphism

We consider the unordered pseudo-tree matching problem: given two unordered labeled trees P and T, find all occurrences of P in T via many-to-one matchings that preserve node labels and parent-child relationships. This problem is closely related to tree pattern matching for XPath queries with the child axis only. For the case m > w, we present an efficient algorithm that solves the problem in O(nm log(w)/w) time using O(hm/w + m log(w)/w) space and O(m log(w)) preprocessing on a unit-cost arithmetic RAM model with addition, where m is the number of nodes in P, n is the number of nodes in T, h is the height of T, and w is the word length, with the assumption w >= log n. We also discuss a modification of our algorithm for the unordered tree homeomorphism problem, which corresponds to tree pattern matching for XPath queries with the descendant axis only.
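To make the problem statement concrete, the following is a minimal sketch of a naive bottom-up matcher. It is not the algorithm of the paper: it takes O(nm) set-membership steps and does not use the word-level techniques that give the O(nm log(w)/w) bound. The names `Node`, `index_pattern`, and `match_roots` are hypothetical, and Python's unbounded integers stand in for bit vectors over the m pattern nodes. A pattern node u matches at a text node v if their labels agree and every child of u matches at some child of v; since the matching is many-to-one, two children of u may map to the same child of v.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an unordered labeled tree (hypothetical helper type)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def index_pattern(root):
    """Number pattern nodes in preorder (root gets index 0).

    Returns (labels, child_masks), where child_masks[i] has bit j set
    exactly when pattern node j is a child of pattern node i.
    """
    labels, child_masks = [], []

    def visit(u):
        i = len(labels)
        labels.append(u.label)
        child_masks.append(0)
        for c in u.children:
            child_masks[i] |= 1 << visit(c)
        return i

    visit(root)
    return labels, child_masks


def match_roots(pattern, text):
    """Return the text nodes at which the pattern root matches (postorder)."""
    labels, child_masks = index_pattern(pattern)
    occurrences = []

    def visit(v):
        # Union of the match sets of all children of v.
        below = 0
        for c in v.children:
            below |= visit(c)
        # u matches at v if labels agree and all children of u are in `below`.
        matched = 0
        for u, (lab, need) in enumerate(zip(labels, child_masks)):
            if lab == v.label and need & ~below == 0:
                matched |= 1 << u
        if matched & 1:  # bit 0 is the pattern root
            occurrences.append(v)
        return matched

    visit(text)
    return occurrences
```

For example, the pattern a(b, b) occurs at the root of the text a(b), because both b children may map to the single b node. Under the same assumptions, a descendant-only variant could be sketched by letting `below` accumulate the match sets of all proper descendants of v rather than of its children only.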
