RNA structural alignments, part II: non-Sankoff approaches for structural alignments.

In structural alignment of RNA sequences, the computational cost of the Sankoff algorithm, which simultaneously optimizes the score of the common secondary structure and the score of the alignment, is too high for long sequences (O(L^6) time for two sequences of length L). In this chapter, we introduce methods that predict the structures and the alignment separately to avoid the heavy computation of the Sankoff algorithm. In these methods, the two prediction processes are not independent; each uses information from the other. The first process typically predicts base-pairing probabilities (BPPs) or candidate stems, and the alignment process uses those results. At the same time, it is important to feed information from the alignment back into the structure prediction. This idea can be implemented as a probabilistic consistency transformation (PCT) of the BPPs using the potential alignment. As with all estimation problems, it is important to define the evaluation measure for the structural alignment. The principle of maximum expected accuracy (MEA) is applicable to the sum-of-pairs score (SPS) computed against a reference alignment.
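To make the MEA idea concrete, the following minimal sketch assumes that a matrix P of posterior match probabilities is already available, where P[i][j] is the probability that position i of the first sequence is aligned to position j of the second. If each correctly aligned pair earns a gain of one, the expected sum-of-pairs score of an alignment is the sum of P[i][j] over its aligned pairs, and a simple dynamic program finds the alignment that maximizes it. The function name `mea_alignment` and the numbers in the usage example are hypothetical and are not taken from any of the tools discussed in the chapter.

```python
def mea_alignment(P):
    """Return (expected_score, pairs) for the alignment that maximizes
    the sum of P[i][j] over aligned pairs (i, j).

    P is an n x m matrix (list of lists) of posterior match
    probabilities; it is assumed to be given.
    """
    n = len(P)
    m = len(P[0]) if n else 0

    # M[i][j]: best expected score for the prefixes of length i and j.
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i][j] = max(
                M[i - 1][j - 1] + P[i - 1][j - 1],  # align i with j
                M[i - 1][j],                        # gap in second sequence
                M[i][j - 1],                        # gap in first sequence
            )

    # Traceback: recover the aligned pairs (0-based indices).
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if M[i][j] == M[i - 1][j - 1] + P[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif M[i][j] == M[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return M[n][m], pairs


if __name__ == "__main__":
    # Hypothetical posteriors for a 2-residue and a 3-residue sequence.
    P = [
        [0.9, 0.1, 0.0],
        [0.1, 0.2, 0.7],
    ]
    score, pairs = mea_alignment(P)
    print(score, pairs)
```

The recursion has no gap penalty because gaps contribute nothing to the expected number of correctly aligned pairs. A structure-aware aligner would additionally fold BPP information into P or into the gain function before running this step.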
