Computing similarity between RNA secondary structures

The primary structure of a ribonucleic acid (RNA) molecule is a sequence of nucleotides (bases) over the four-letter alphabet {A, C, G, U}. The secondary structure of an RNA is a set of base pairs (pairs of nucleotides) that form bonds between A and U and between C and G. These bonds have traditionally been assumed to be non-crossing in the secondary structure, which implies a tree representation of the secondary structure of an RNA molecule. This paper considers several notions of similarity between two RNA molecule structures, taking into account both the primary and the secondary structures. We consider a natural tree representation that carries both primary and secondary structure data, and we present efficient algorithms for comparing such tree representations. We then show that some of these similarity notions can be used to solve the structure prediction problem when the structure of a closely related RNA is known.
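To illustrate why non-crossing base pairs yield a tree, the following is a minimal sketch, not the paper's own construction. It assumes the secondary structure is given in dot-bracket notation (an input format not mentioned in the abstract), and the names `Node`, `build_tree`, and `show` are hypothetical. Each bonded pair becomes an internal node, each unpaired base becomes a leaf, and nesting of pairs becomes the parent-child relation.

```python
from dataclasses import dataclass, field
from typing import List

# Bonds allowed by the abstract: A-U and C-G (in either order).
ALLOWED = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


@dataclass
class Node:
    # "root", a pair such as "G-C", or a single unpaired base such as "A".
    label: str
    children: List["Node"] = field(default_factory=list)


def build_tree(seq: str, structure: str) -> Node:
    """Build a tree from a sequence and a dot-bracket string (assumed format)."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    root = Node("root")
    stack = [(root, -1)]  # (open node, index of its opening base)
    for i, (base, sym) in enumerate(zip(seq, structure)):
        if sym == "(":
            # Opening base of a pair: start a new internal node.
            node = Node("")
            stack[-1][0].children.append(node)
            stack.append((node, i))
        elif sym == ")":
            # Closing base: finish the innermost open pair.
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            node, j = stack.pop()
            if (seq[j], base) not in ALLOWED:
                raise ValueError(f"invalid pair {seq[j]}-{base}")
            node.label = f"{seq[j]}-{base}"
        elif sym == ".":
            # Unpaired base: a leaf under the innermost open pair.
            stack[-1][0].children.append(Node(base))
        else:
            raise ValueError(f"unexpected symbol {sym!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure")
    return root


def show(node: Node) -> str:
    """Render the tree as a nested string for inspection."""
    if not node.children:
        return node.label
    return node.label + "(" + ",".join(show(c) for c in node.children) + ")"


print(show(build_tree("GCAAAGC", "((...))")))
```

Because the pairs never cross, a single stack is enough to recover the nesting, and the resulting tree keeps both the bases (primary structure) and the bonds (secondary structure) in one object that a tree comparison method can operate on.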
