Phylogenetically enhanced statistical tools for RNA structure prediction

MOTIVATION: Methods that predict the structure of molecules by looking for statistical correlation have been quite effective. Unfortunately, these methods often disregard the phylogenetic information in the sequences they analyze. Here, we present a number of statistics for RNA molecular-structure prediction. Besides the common pairwise comparisons, we consider a few reasonable statistics for base-triple prediction and present a detailed analysis of these methods. All of these statistics incorporate the phylogenetic relationships of the sequences to varying degrees, and the differing nature of the tests gives a wide choice of statistical tools for RNA structure prediction.

RESULTS: Starting from statistics that incorporate phylogenetic information only as independent sequence-evolution models for each position of a multiple alignment, and extending this idea to a joint evolution model of two positions, we enhance the usual purely statistical methods (e.g. methods based on the Mutual Information statistic) with the phylogenetic information available in the sequences. In particular, we present a joint model based on the HKY evolution model and, from it, a χ² test of independence for two positions. A significant part of this work is devoted to the mathematical analysis of these methods. We tested these statistics on regions of 16S and 23S rRNA, and on tRNA.
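For orientation, the purely statistical baseline mentioned above (Mutual Information and a Pearson χ² statistic for two alignment columns) can be sketched as follows. This is a minimal illustration, not the phylogenetically enhanced statistics of the paper: it treats the aligned sequences as independent samples, which is exactly the simplification the abstract criticizes. The function names and the toy alignment are assumptions made for this example.

```python
import math
from collections import Counter


def column(alignment, i):
    """Return column i of a multiple alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns.

    Assumes the sequences are independent samples (no phylogeny).
    """
    n = len(col_x)
    fx = Counter(col_x)                 # marginal counts at position x
    fy = Counter(col_y)                 # marginal counts at position y
    fxy = Counter(zip(col_x, col_y))    # joint counts of (x, y) pairs
    mi = 0.0
    for (a, b), count in fxy.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        mi += p_ab * math.log2(p_ab * n * n / (fx[a] * fy[b]))
    return mi


def chi_square(col_x, col_y):
    """Pearson chi-square statistic for independence of two columns.

    Returns (statistic, degrees_of_freedom) over the observed symbols.
    """
    n = len(col_x)
    fx = Counter(col_x)
    fy = Counter(col_y)
    fxy = Counter(zip(col_x, col_y))
    stat = 0.0
    for a in fx:
        for b in fy:
            expected = fx[a] * fy[b] / n
            observed = fxy.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(fx) - 1) * (len(fy) - 1)
    return stat, dof


# Hypothetical toy alignment: positions 0 and 2 covary, position 1 is constant.
toy_alignment = ["GAC", "CAG", "AAU", "UAA"]
mi_pair = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
chi_pair, dof_pair = chi_square(column(toy_alignment, 0), column(toy_alignment, 2))
```

Because closely related sequences share ancestry, counts like these overstate the evidence for covariation; replacing the independence assumption with an evolutionary model along the tree is the direction the abstract describes.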
