New dissimilarity measure for recognizing noisy subsequence trees

A tree is a data structure used to represent various objects, such as semistructured data and genes. When objects are represented as trees, computing tree similarity is essential for pattern recognition and retrieval. This paper considers the noisy subsequence tree recognition problem, whose goal is to recognize the original tree given a noisy subsequence tree derived from it. Previous research on this problem relied on the constrained tree edit distance to measure dissimilarity. However, computing that distance requires the number of relabelings to be fixed in advance.
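To make the recognition setting concrete, the following is a minimal sketch of dictionary-based recognition: given a set of known trees and an observed noisy subsequence tree, return the known tree with the smallest dissimilarity. The names `Node`, `toy_dissimilarity`, and `recognize` are hypothetical, and the dissimilarity used here (the difference between label multisets, ignoring structure) is only a placeholder assumption. It is neither the measure proposed in this paper nor the constrained tree edit distance.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A labeled, ordered tree node (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def labels(tree: Node) -> Counter:
    """Count how often each label occurs in the tree; structure is ignored."""
    counts = Counter([tree.label])
    for child in tree.children:
        counts += labels(child)
    return counts


def toy_dissimilarity(candidate: Node, observed: Node) -> int:
    """Placeholder measure: number of labels present in one tree but not the other."""
    a, b = labels(candidate), labels(observed)
    return sum(((a - b) + (b - a)).values())


def recognize(
    dictionary: Dict[str, Node],
    observed: Node,
    dissimilarity: Callable[[Node, Node], int] = toy_dissimilarity,
) -> str:
    """Return the name of the dictionary tree least dissimilar to the observation."""
    return min(dictionary, key=lambda name: dissimilarity(dictionary[name], observed))


# Two known trees: t1 = a(b, c(d)) and t2 = a(e, f(g)).
t1 = Node("a", [Node("b"), Node("c", [Node("d")])])
t2 = Node("a", [Node("e"), Node("f", [Node("g")])])
dictionary = {"t1": t1, "t2": t2}

# Observation: t1 with node b deleted (subsequence) and d relabeled to x (noise).
observed = Node("a", [Node("c", [Node("x")])])

print(recognize(dictionary, observed))
```

Passing the dissimilarity as a parameter keeps the recognition step independent of the measure, so a structure-aware distance could be substituted without changing `recognize`.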
