Toward Turbo Decoding of RNA Secondary Structure

We propose an iterative probabilistic algorithm for estimating RNA secondary structure from two homologous sequences. The method is intended to exploit intersequence correlations "encoded" in probabilistic models for alignment and for common secondary structure. In analogy with turbo decoding in digital communications, we formulate a maximum a posteriori probability objective function for joint structure prediction and sequence alignment, which is addressed by iterating over individual structure and sequence alignment models with soft-input soft-output estimators. As a preliminary step toward realizing this methodology, we present results obtained by incorporating (hard) constraints based on posterior sequence alignment probabilities into joint secondary structure prediction. Through experimental evaluations over available databases of known secondary structures, we demonstrate that these constraints significantly decrease computation time while providing a marginal increase in structure prediction accuracy.
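The hard-constraint step described above can be illustrated with a minimal sketch. It assumes that a matrix of posterior alignment probabilities is already available (for example, from a pairwise alignment model) and that positions are excluded when their probability falls below a fixed cutoff. The function names, the toy matrix, and the threshold value are hypothetical and are not taken from the paper.

```python
from typing import List


def alignment_constraint_mask(
    posterior: List[List[float]], threshold: float
) -> List[List[bool]]:
    """Turn soft posterior alignment probabilities into hard constraints.

    posterior[i][k] is assumed to be the posterior probability that
    position i of sequence 1 is aligned with position k of sequence 2.
    mask[i][k] is True when that pairing remains allowed in the joint
    structure prediction and False when it is pruned.
    """
    return [[p >= threshold for p in row] for row in posterior]


def allowed_fraction(mask: List[List[bool]]) -> float:
    """Fraction of alignment positions that survive the pruning."""
    total = sum(len(row) for row in mask)
    kept = sum(cell for row in mask for cell in row)
    return kept / total if total else 0.0


# Hypothetical posteriors for a toy 3 x 4 alignment problem.
posterior = [
    [0.90, 0.08, 0.01, 0.01],
    [0.05, 0.85, 0.09, 0.01],
    [0.01, 0.04, 0.60, 0.35],
]

# The threshold is an illustrative assumption, not a value from the paper.
mask = alignment_constraint_mask(posterior, threshold=0.02)
fraction = allowed_fraction(mask)
```

A joint folding and alignment routine would then skip every position pair where the mask is False. Under this assumption, shrinking the search space is the source of the reduction in computation time.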
