The Noisy Subsequence Tree Recognition Problem

In this paper we consider the problem of recognizing ordered labeled trees by processing their noisy subsequence-trees, which are “patched-up” noisy portions of their fragments. We assume that we are given H, a finite dictionary of ordered labeled trees. X* is an unknown element of H, and U is an arbitrary subsequence-tree of X*. We consider the problem of estimating X* by processing Y, a noisy version of U. We do this by sequentially comparing Y with every element X of H and taking as the estimate the element closest to Y. The basis of comparison is the constrained edit distance between two trees [OL94], where the constraint implicitly captures the properties of the corrupting mechanism (the “channel”) that noisily garbles U into Y. Experimental results on manually constructed trees of 25 to 35 nodes, containing an average of 21.8 errors per tree, demonstrate that the scheme has about 92.8% accuracy. Similar experiments on randomly generated trees yielded an accuracy of 86.4%. To our knowledge, this is the first reported solution to the problem.
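
As an informal illustration of the dictionary search described above, the following Python sketch compares a noisy tree Y with every element X of H and returns the closest one. It is a minimal sketch under stated assumptions, not the scheme evaluated in the experiments: the distance used here is the plain unit-cost edit distance between ordered trees, standing in for the constrained edit distance of [OL94], and the tree encoding and the names size, forest_distance, tree_distance and recognize are hypothetical choices made for this example.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees. Tuples are hashable, so intermediate results can be memoized.

def size(tree):
    """Number of nodes in a tree."""
    _, children = tree
    return 1 + sum(size(child) for child in children)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees).

    This is the unconstrained distance, used only as a stand-in for the
    constrained edit distance referred to in the text.
    """
    if not f1 and not f2:
        return 0
    if not f1:
        return sum(size(t) for t in f2)   # insert every node of f2
    if not f2:
        return sum(size(t) for t in f1)   # delete every node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        # delete the root of the rightmost tree of f1
        forest_distance(f1[:-1] + kids1, f2) + 1,
        # insert the root of the rightmost tree of f2
        forest_distance(f1, f2[:-1] + kids2) + 1,
        # map the two rightmost roots onto each other
        forest_distance(f1[:-1], f2[:-1])
        + forest_distance(kids1, kids2)
        + (label1 != label2),
    )

def tree_distance(t1, t2):
    """Edit distance between two ordered labeled trees."""
    return forest_distance((t1,), (t2,))

def recognize(y, dictionary, distance=tree_distance):
    """Return the dictionary element with the smallest distance to y."""
    return min(dictionary, key=lambda x: distance(x, y))

# Toy dictionary H and a noisy observation Y (illustrative data only).
x1 = ("a", (("b", ()), ("c", (("d", ()),))))
x2 = ("x", (("y", ()), ("z", ())))
H = [x1, x2]
Y = ("a", (("c", (("e", ()),)),))  # x1 with node b dropped and d garbled to e
estimate = recognize(Y, H)
```

The distance argument of recognize is left as a parameter so that a constrained distance could be substituted without changing the search loop. The recursion is exponential in the worst case without memoization and is intended only for small illustrative trees.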