Reducing the Worst Case Running Times of a Family of RNA and CFG Problems, Using Valiant's Approach

We study Valiant's classical algorithm for context-free grammar recognition in sub-cubic time and extract the features common to problems to which Valiant's approach can be applied. Based on these features, we describe several problem templates and formulate generic algorithms that use Valiant's technique and apply to every problem that fits these templates. These algorithms yield new worst-case running time bounds for a large family of important problems in the domain of RNA secondary structures and context-free grammars.
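For orientation, the following is a minimal sketch of the cubic-time CYK recognizer, the baseline that Valiant's algorithm improves on. It is not the sub-cubic algorithm studied here: it only shows the split-point loop whose work Valiant's technique batches into matrix multiplications. The function name `cyk_recognize`, the rule encoding, and the toy grammar are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def cyk_recognize(word, unary_rules, binary_rules, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unary_rules:  maps a terminal to the set of nonterminals A with A -> terminal
    binary_rules: maps a pair (B, C) to the set of nonterminals A with A -> B C
    (Hypothetical encoding chosen for this sketch.)
    """
    n = len(word)
    if n == 0:
        return False  # the empty word is not handled in this sketch
    # table[i][j] holds the nonterminals deriving word[i:j], for 0 <= i < j <= n
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, ch in enumerate(word):
        table[i][i + 1] = set(unary_rules.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Try every split point k: this loop is what makes CYK cubic.
            for k in range(i + 1, j):
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= binary_rules.get((b, c), set())
    return start in table[0][n]


# Toy grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "ba"]:
        print(w, cyk_recognize(w, UNARY, BINARY))
```

Each table cell combines pairs of smaller cells over all split points, which has the shape of a matrix product over sets of nonterminals. That structural property is the kind of feature the problem templates are meant to capture.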
