A Tree Distance Function Based on Multi-sets

We introduce a tree distance function based on multi-sets. We show that this function is a metric on tree spaces, and we design an algorithm that computes the distance between trees of size at most n in O(n^2) time and O(n) space. Unlike other tree distance functions, which require expensive memory allocations to maintain dynamic programming tables of forests, our function can be implemented over simple, static structures. Additionally, we present a case study in which we compare our function with two other distance functions.
