An algorithm for comparing RNA secondary structures and searching for similar substructures

To access the functional information carried by RNA molecules at the level of their secondary structure interactions, we propose a comparison method based on a tree edit algorithm that takes into account the tree structure of RNA foldings. Each secondary structure is translated into a tree comprising all its elementary substructures; a shorter, condensed tree is then built in which each unbranched helix, together with its interspersed bulges and interior loops, is taken as a single node. The method has several parameters: a comparison matrix between structural units, gap penalties, and the scoring between nodes of the condensed trees. Their effects were analysed using, as a model, a rapidly divergent domain of the large ribosomal RNA for which structural variation during evolution is well known. The method makes it possible to recognize precisely, in large target molecules, definite substructures that share with the query molecules only a limited set of closely related secondary structure features; it remains effective when such structural elements are separated by intervening features, which can correspond to the insertion or deletion of entire stem regions. When coupled with a hierarchical clustering algorithm, the method is suitable for classifying RNA molecules according to their secondary structure homologies.
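The two-level tree representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the secondary structure is given in dot-bracket notation (an input format not mentioned in the abstract), and the names `Node`, `Helix`, `parse_dot_bracket` and `condense` are hypothetical. The full tree has one node per base pair or unpaired base; the condensed tree merges each unbranched run of base pairs, including any bulges and interior loops, into a single helix node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Node of the full tree: kind is 'root', 'pair' or 'unpaired'."""
    kind: str
    children: list = field(default_factory=list)


@dataclass
class Helix:
    """Node of the condensed tree: one unbranched helix."""
    pairs: int      # number of base pairs in the helix
    unpaired: int   # unpaired bases in bulges/interior loops inside the helix
    loop: int       # unpaired bases in the loop that ends the helix
    children: list  # helices branching off that terminal loop


def parse_dot_bracket(structure):
    """Build the full tree from a dot-bracket string (assumed input format)."""
    root = Node("root")
    stack = [root]
    for ch in structure:
        if ch == "(":
            node = Node("pair")
            stack[-1].children.append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')' in structure")
            stack.pop()
        elif ch == ".":
            stack[-1].children.append(Node("unpaired"))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced '(' in structure")
    return root


def _condense_pair(node):
    """Collapse the unbranched helix that starts at this pair node."""
    pairs, unpaired = 1, 0
    while True:
        paired = [c for c in node.children if c.kind == "pair"]
        n_free = sum(1 for c in node.children if c.kind == "unpaired")
        if len(paired) != 1:
            # Hairpin loop (no inner pair) or multibranch loop (two or more).
            return Helix(pairs, unpaired, n_free,
                         [_condense_pair(c) for c in paired])
        # Exactly one inner pair: stacking, bulge or interior loop.
        unpaired += n_free
        pairs += 1
        node = paired[0]


def condense(root):
    """Return the top-level helices of the condensed tree."""
    return [_condense_pair(c) for c in root.children if c.kind == "pair"]


if __name__ == "__main__":
    # A stem with an interior loop, then a two-way multibranch structure.
    print(condense(parse_dot_bracket("((.((...))..))")))
    print(condense(parse_dot_bracket("((..((...))..((...))..))")))
```

In this sketch the first structure condenses to a single helix node, whereas the second yields a two-pair helix with two child helices. A tree edit comparison of the kind the abstract describes would then operate on such condensed nodes, with the comparison matrix and gap penalties as parameters; those scoring details are not given here and are left out of the sketch.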
