Lifting Prediction to Alignment of RNA Pseudoknots

Prediction and alignment of RNA pseudoknot structures are NP-hard. Nevertheless, several efficient prediction algorithms by dynamic programming have been proposed for restricted classes of pseudoknots. We present a general scheme that yields an efficient alignment algorithm for arbitrary such classes. Moreover, we show that such an alignment algorithm benefits from the class restriction in the same way as the corresponding structure prediction algorithm does. We look at six of these classes in greater detail. The time and space complexity of the alignment algorithm is increased by only a linear factor over the respective prediction algorithm. For five of the classes, no efficient alignment algorithms were known. For the sixth, most general class, we improve the previously best complexity of O(n(5)m(5)) time to O(nm(6)), where n and m denote sequence lengths. Finally, we apply our fastest algorithm with O(nm(4)) time and O(nm(2)) space to comparative de-novo pseudoknot prediction.

[1]  Rolf Backofen,et al.  Methods for multiple alignment and consensus structure prediction of RNAs implemented in MARNA. , 2007, Methods in molecular biology.

[2]  Rolf Backofen,et al.  Fixed Parameter Tractable Alignment of RNA Structures Including Arbitrary Pseudoknots , 2008, CPM.

[3]  Hosna Jabbari,et al.  An O(n5) Algorithm for MFE Prediction of Kissing Hairpins and 4-Chains in Nucleic Acids , 2009, J. Comput. Biol..

[4]  Donald Kennedy,et al.  Breakthrough of the Year , 2007, Science.

[5]  D. W. Staple,et al.  Open access, freely available online Primer Pseudoknots: RNA Structures with Diverse Functions , 2022 .

[6]  D. Turner,et al.  Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. , 2002, Journal of molecular biology.

[7]  Peter F. Stadler,et al.  Non-coding RNAs in Ciona intestinalis , 2005, ECCB/JBI.

[8]  Sean R. Eddy,et al.  Rfam: an RNA family database , 2003, Nucleic Acids Res..

[9]  Patricia A. Evans Finding Common Subsequences with Arcs and Pseudoknots , 1999, CPM.

[10]  Michael Zuker,et al.  Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information , 1981, Nucleic Acids Res..

[11]  D. Higgins,et al.  T-Coffee: A novel method for fast and accurate multiple sequence alignment. , 2000, Journal of molecular biology.

[12]  Patricia A. Evans Finding common RNA pseudoknot structures in polynomial time , 2011, J. Discrete Algorithms.

[13]  Christian N. S. Pedersen,et al.  Pseudoknots in RNA Secondary Structures , 2000 .

[14]  Rolf Backofen,et al.  Inferring Noncoding RNA Families and Classes by Means of Genome-Scale Structure-Based Clustering , 2007, PLoS Comput. Biol..

[15]  A Xayaphoummine,et al.  Prediction and statistics of pseudoknots in RNA structures using exactly clustered stochastic simulations , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[16]  P. Stadler,et al.  Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome , 2005, Nature Biotechnology.

[17]  Jan Gorodkin,et al.  Fast Pairwise Structural RNA Alignments by Pruning of the Dynamical Programming Matrix , 2007, PLoS Comput. Biol..

[18]  Bin Ma,et al.  A General Edit Distance between RNA Structures , 2002, J. Comput. Biol..

[19]  Laurie J. Heyer,et al.  Finding the most significant common sequence and structure motifs in a set of RNA sequences. , 1997, Nucleic acids research.

[20]  J. Couzin Small RNAs Make Big Splash , 2002, Science.

[21]  Jitender S. Deogun,et al.  RNA Secondary Structure Prediction with Simple Pseudoknots , 2004, APBC.

[22]  Robert Giegerich,et al.  Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics , 2004, BMC Bioinformatics.

[23]  Rolf Backofen,et al.  Backofen R: MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons , 2005 .

[24]  Walter Fontana,et al.  Fast folding and comparison of RNA secondary structures , 1994 .

[25]  Niles A. Pierce,et al.  A partition function algorithm for nucleic acid secondary structure including pseudoknots , 2003, J. Comput. Chem..

[26]  E Rivas,et al.  A dynamic programming algorithm for RNA structure prediction including pseudoknots. , 1998, Journal of molecular biology.

[27]  Knut Reinert,et al.  Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization , 2007, BMC Bioinformatics.

[28]  Satoshi Kobayashi,et al.  Tree Adjoining Grammars for RNA Structure Prediction , 1999, Theor. Comput. Sci..

[29]  Gaurav Sharma,et al.  Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign , 2007, BMC Bioinformatics.

[30]  Sean R. Eddy,et al.  Rfam: annotating non-coding RNAs in complete genomes , 2004, Nucleic Acids Res..

[31]  Jennifer Couzin,et al.  Small RNAs Make Big Splash , 2002, Science.

[32]  Anne Condon,et al.  Classifying RNA pseudoknotted structures , 2004, Theor. Comput. Sci..

[33]  Tatsuya Akutsu,et al.  Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots , 2000, Discret. Appl. Math..

[34]  Serafim Batzoglou,et al.  CONTRAfold: RNA secondary structure prediction without physics-based models , 2006, ISMB.

[35]  Peter F Stadler,et al.  Fast and reliable prediction of noncoding RNAs , 2005, Proc. Natl. Acad. Sci. USA.