Forest Alignment with Affine Gaps and Anchors

We present two enhancements to Jiang's tree alignment algorithm, motivated by experience with its use for RNA structure alignment. The first is an affine gap model, which can be accommodated at the cost of a constant-factor increase in runtime. The second is a speed-up of the alignment algorithm when certain nodes in the trees are pre-aligned by a so-called anchoring. Both enhancements are included in a new implementation of the tool RNAforester. We also argue that tree alignment should be parameterized by a user-defined set of edit operations, generalizing the traditional atomic edit operations.
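
To illustrate why an affine gap model costs only a constant factor, the following minimal sketch shows the well-known sequence analogue (Gotoh's three-state recurrence), not the forest algorithm of this work. A gap of length k is assumed to cost gap_open + k * gap_extend. The function name, parameter names, and default costs are illustrative assumptions. Each cell stores three values instead of one, so the asymptotic running time of plain sequence alignment is unchanged.

```python
from math import inf


def affine_gap_distance(a, b, mismatch=1, gap_open=2, gap_extend=1):
    """Minimum alignment cost of sequences a and b under affine gap costs.

    Illustrative sequence-level sketch only (assumed cost scheme):
    a gap of length k costs gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    # M: a[i-1] is aligned with b[j-1]
    # X: a[i-1] is aligned with a gap (deletion from a)
    # Y: b[j-1] is aligned with a gap (insertion from b)
    M = [[inf] * (m + 1) for _ in range(n + 1)]
    X = [[inf] * (m + 1) for _ in range(n + 1)]
    Y = [[inf] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    start = gap_open + gap_extend  # cost of the first position of a new gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = min(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + sub
            # Extending an open gap is cheaper than starting a new one.
            X[i][j] = min(M[i - 1][j] + start,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + start)
            Y[i][j] = min(M[i][j - 1] + start,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + start)

    return min(M[n][m], X[n][m], Y[n][m])
```

With these assumed costs, aligning "abcd" with "ad" is cheapest with a single gap of length two rather than two separate gaps, which is the behaviour an affine model is meant to encourage.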