Paper information - The Noisy Subsequence Tree Recognition Problem

The Noisy Subsequence Tree Recognition Problem

In this paper we consider the problem of recognizing ordered labeled trees by processing their noisy subsequence-trees, which are “patched-up” noisy portions of their fragments. We assume that we are given H, a finite dictionary of ordered labeled trees. X* is an unknown element of H, and U is an arbitrary subsequence-tree of X*. We consider the problem of estimating X* by processing Y, a noisy version of U. We do this by sequentially comparing Y with every element X of H, the basis of comparison being the constrained edit distance between two trees [OL94], where the constraint implicitly captures the properties of the corrupting mechanism (the “channel”) that noisily garbles U into Y. Experimental results involving manually constructed trees of between 25 and 35 nodes, containing an average of 21.8 errors per tree, demonstrate that the scheme has about 92.8% accuracy. Similar experiments on randomly generated trees yielded an accuracy of 86.4%. To our knowledge, this is the first reported solution to the problem.
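The recognition loop described in the abstract (compare Y with every X in H and keep the closest) can be sketched as follows. This is only an illustration under stated assumptions: the names `Node`, `top_down_distance`, and `recognize` are hypothetical, and the distance used here is a simple unit-cost, top-down tree edit distance in the spirit of Selkow's tree-to-tree editing, standing in for the constrained edit distance of [OL94], which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Node:
    """An ordered labeled tree: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the tree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def top_down_distance(a: Node, b: Node) -> int:
    """Stand-in distance (NOT the constrained distance of [OL94]).

    Unit-cost top-down edit distance: roots are matched (cost 1 if the
    labels differ), and the child sequences are aligned by a string-style
    dynamic program in which inserting or deleting a whole subtree costs
    its size.
    """
    root_cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree of a
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree of b
                d[i - 1][j - 1]
                + top_down_distance(a.children[i - 1], b.children[j - 1]),
            )
    return root_cost + d[m][n]


def recognize(
    y: Node,
    dictionary: Sequence[Node],
    distance: Callable[[Node, Node], int] = top_down_distance,
) -> Node:
    """Return the dictionary element X that minimizes distance(X, Y)."""
    return min(dictionary, key=lambda x: distance(x, y))


# Toy dictionary H and a fragment Y of x1 with one node dropped.
x1 = Node("a", [Node("b"), Node("c", [Node("d")])])
x2 = Node("a", [Node("e"), Node("f")])
y = Node("a", [Node("b"), Node("c")])
estimate = recognize(y, [x1, x2])
```

Passing the distance as a parameter keeps the sequential comparison separate from the choice of metric, so a constrained distance could be substituted without touching `recognize`.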

B. John Oommen | Richard K. S. Loke

[1] B. John Oommen, et al. A formal theory for optimal and information theoretic syntactic pattern recognition, 1998, Pattern Recognit.

[2] Kaizhong Zhang, et al. Comparing multiple RNA secondary structures using tree comparisons, 1990, Comput. Appl. Biosci.

[3] B. John Oommen, et al. Constrained Tree Editing, 1994, Inf. Sci.

[4] Kaizhong Zhang, et al. Fast Serial and Parallel Algorithms for Approximate Tree Matching with VLDC's, 1992, CPM.

[5] Shin-Yee Lu. A Tree-to-Tree Distance and Its Application to Cluster Analysis, 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Tao Jiang, et al. Some MAX SNP-Hard Results Concerning Unordered Labeled Trees, 1994, Inf. Process. Lett.

[7] B. John Oommen. Recognition of Noisy Subsequences Using Constrained Edit Distances, 1987, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8] David Sankoff, et al. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, 1983.

[9] Kuo-Chung Tai, et al. The Tree-to-Tree Correction Problem, 1979, JACM.

[10] Stanley M. Selkow, et al. The Tree-to-Tree Editing Problem, 1977, Inf. Process. Lett.

[11] Kaizhong Zhang, et al. Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems, 1989, SIAM J. Comput.