Parametric alignment of ordered trees

MOTIVATION: Computing the similarity between two ordered trees has applications in RNA secondary structure comparison, genetics and chemical structure analysis. Alignment of trees is one of the proposed similarity measures. As in pairwise sequence comparison, there is often disagreement about how to weight matches, mismatches, indels and gaps when comparing two trees. For sequence comparison, parametric sequence alignment tools have been developed that let users see explicitly and completely the effect of parameter choices on the optimal sequence alignments. A similar tool for aligning two ordered trees is needed in practice.

RESULTS: We develop a parametric tool for aligning two ordered trees that allows users to see the effect of parameter choices on the optimal alignment of trees. Our contributions are: (1) a parametric tool for aligning two ordered trees; (2) an efficient algorithm for aligning two ordered trees with gap penalties that runs in O(n^2 deg^2) time, where n is the number of nodes in the trees and deg is the degree of the trees; and (3) a reduction of the algorithm's space requirement from O(n^2 deg^2) to O(n log n · deg^2).

AVAILABILITY: The software is available at http://www.cs.cityu.edu.hk/~lwang/software/ParaTree
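To illustrate what "seeing the effect of parameter choices" means, the following minimal sketch treats the score of an alignment as a linear function of an indel weight and a gap weight and reports which candidate alignment wins at a few sampled parameter points. It is an illustration only and not the ParaTree algorithm: the names `Counts`, `score`, `best_candidate`, `optimality_map` and `CANDIDATES`, the scoring convention (matches rewarded, everything else penalised) and the example counts are all assumptions made here. A real parametric tool would compute the optimality regions exactly instead of sampling, and would search over all alignments rather than a fixed list.

```python
from typing import NamedTuple


class Counts(NamedTuple):
    """Hypothetical summary of one candidate alignment of two trees."""
    matches: int
    mismatches: int
    indels: int
    gaps: int


def score(c: Counts, alpha: float, beta: float) -> float:
    # Assumed convention: each match scores +1, each mismatch -1,
    # each indel costs alpha and each gap costs beta.
    return c.matches - c.mismatches - alpha * c.indels - beta * c.gaps


def best_candidate(cands: list[Counts], alpha: float, beta: float) -> int:
    # Index of the highest-scoring candidate (first one wins on ties).
    return max(range(len(cands)), key=lambda i: score(cands[i], alpha, beta))


def optimality_map(cands, alphas, betas):
    # Sample the (alpha, beta) plane and record the winner at each point.
    return {(a, b): best_candidate(cands, a, b) for a in alphas for b in betas}


# Made-up candidates: many matches but many indels/gaps, a compromise,
# and an alignment with no indels or gaps at all.
CANDIDATES = [
    Counts(matches=10, mismatches=0, indels=6, gaps=3),
    Counts(matches=8, mismatches=2, indels=2, gaps=1),
    Counts(matches=5, mismatches=5, indels=0, gaps=0),
]

if __name__ == "__main__":
    for point, winner in sorted(optimality_map(CANDIDATES, [0, 1, 3], [0, 1, 3]).items()):
        print(point, "-> candidate", winner)
```

As the indel and gap weights grow, the winner in this toy example shifts from the gap-heavy candidate to the gap-free one, which is the kind of dependence a parametric tool makes visible to the user.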
