Stochastic context-free grammars for tRNA modeling.

Stochastic context-free grammars (SCFGs) are applied to the problems of folding, aligning and modeling families of tRNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. Results show that, after training on as few as 20 tRNA sequences from only two tRNA subfamilies (mitochondrial and cytoplasmic), the model can distinguish general tRNA from similar-length RNA sequences of other kinds, can predict the secondary structure of new tRNA sequences, and can produce multiple alignments of large sets of tRNA sequences. Our results also suggest potential improvements in the alignments of the D- and T-domains in some mitochondrial tRNAs that cannot be fit into the canonical secondary structure.
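
To illustrate how an SCFG can capture base pairing, which an HMM cannot represent, the following is a minimal sketch and not the model described in the abstract. It assumes a toy grammar with a single nonterminal S and four rule types (S → x S y for a base pair, S → x S and S → S x for unpaired bases, S → ε to stop), and finds the most probable parse with a CYK-style Viterbi recursion. All probabilities and names (`P_PAIR`, `PAIR`, `viterbi_fold`) are hypothetical values chosen for illustration; they are not trained parameters, and the grammar has no bifurcation rule, so it can describe only a single stem-loop rather than a full tRNA cloverleaf.

```python
import math
from itertools import product

ALPHABET = "ACGU"

# Hypothetical rule probabilities for the single nonterminal S (they sum to 1).
P_PAIR, P_LEFT, P_RIGHT, P_END = 0.4, 0.25, 0.25, 0.1

# Hypothetical emission probabilities: uniform for unpaired bases,
# strongly favoring Watson-Crick pairs (and mildly G-U wobble) for paired ones.
SINGLE = {x: 0.25 for x in ALPHABET}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
PAIR = {}
for x, y in product(ALPHABET, repeat=2):
    if (x, y) in WATSON_CRICK:
        PAIR[(x, y)] = 0.22
    elif (x, y) in WOBBLE:
        PAIR[(x, y)] = 0.04
    else:
        PAIR[(x, y)] = 0.004


def viterbi_fold(seq):
    """Return (log probability, dot-bracket structure) of the best parse."""
    n = len(seq)
    # best[i][j]: log probability of the most likely derivation of seq[i:j] from S.
    best = [[-math.inf] * (n + 1) for _ in range(n + 1)]
    back = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        best[i][i] = math.log(P_END)  # S -> epsilon
        back[i][i] = "end"

    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span
            candidates = [
                # S -> x S : leftmost base is unpaired
                (math.log(P_LEFT * SINGLE[seq[i]]) + best[i + 1][j], "left"),
                # S -> S x : rightmost base is unpaired
                (math.log(P_RIGHT * SINGLE[seq[j - 1]]) + best[i][j - 1], "right"),
            ]
            if span >= 2:
                # S -> x S y : the two outermost bases form a pair
                pair_prob = P_PAIR * PAIR[(seq[i], seq[j - 1])]
                candidates.append((math.log(pair_prob) + best[i + 1][j - 1], "pair"))
            best[i][j], back[i][j] = max(candidates)

    # Trace back through the chosen rules to recover the secondary structure.
    structure = ["."] * n
    i, j = 0, n
    while back[i][j] != "end":
        rule = back[i][j]
        if rule == "pair":
            structure[i], structure[j - 1] = "(", ")"
            i, j = i + 1, j - 1
        elif rule == "left":
            i += 1
        else:
            j -= 1
    return best[0][n], "".join(structure)


if __name__ == "__main__":
    score, structure = viterbi_fold("GGGAAAACCC")
    print(structure)  # prints (((....)))
```

The pair rule emits two correlated symbols at once, which is what lets the grammar express nested long-range interactions. Dropping that rule leaves only the left and right emission rules, which can generate just a regular language and therefore nothing beyond what an HMM can model, consistent with the abstract's statement that SCFGs generalize HMMs.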
