Improved Algorithms for Parsing ESLTAGs: A Grammatical Model Suitable for RNA Pseudoknots

Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammars (SLTAGs) and Extended SLTAGs (ESLTAGs). The performance of techniques that employ grammatical formalisms depends critically on the efficiency of the underlying parsing algorithms. In this paper, we present efficient algorithms for parsing SLTAGs and ESLTAGs. Our algorithm for SLTAG parsing takes O(min{m, n^4}) time and O(min{m, n^4}) space, where n is the length of the input and m is the number of entries that will ever be made in the matrix M normally used by TAG parsing algorithms. Our algorithm for ESLTAG parsing takes O(n min{m, n^4}) time and O(min{m, n^4}) space. We show that these algorithms perform better in practice than the algorithms of Uemura et al.
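
To make the min{m, n^4} bound concrete, the sketch below contrasts a sparsely stored parsing matrix with the size of the full table. It is an illustration only, not the algorithm of the paper: the class `SparseChart`, the function `dense_cells`, and the choice to index cells by four positions (i, j, k, l) with 0 <= i <= j <= k <= l <= n are assumptions made for this example.

```python
from math import comb


class SparseChart:
    """Hypothetical sparse stand-in for the matrix M.

    Only cells that are actually filled are stored, so memory grows with
    the number of filled cells rather than with the full table size.
    """

    def __init__(self, n):
        self.n = n          # input length
        self.cells = {}     # (i, j, k, l) -> set of labels stored in that cell

    def add(self, i, j, k, l, label):
        """Record `label` in cell (i, j, k, l); return True if it is new."""
        if not (0 <= i <= j <= k <= l <= self.n):
            raise ValueError("indices must satisfy 0 <= i <= j <= k <= l <= n")
        labels = self.cells.setdefault((i, j, k, l), set())
        if label in labels:
            return False
        labels.add(label)
        return True

    def filled_cells(self):
        """Number of distinct cells that hold at least one label."""
        return len(self.cells)


def dense_cells(n):
    """Number of index tuples 0 <= i <= j <= k <= l <= n, which is O(n^4)."""
    return comb(n + 4, 4)


chart = SparseChart(10)
chart.add(0, 2, 5, 7, "X")
chart.add(0, 2, 5, 7, "X")   # duplicate entry, ignored
chart.add(1, 3, 3, 6, "Y")
print(chart.filled_cells(), "of", dense_cells(10), "possible cells used")
```

When few cells are ever filled, the dictionary stays far smaller than the full table; in the worst case it can never exceed the number of valid index tuples, which mirrors the min{m, n^4} form of the bound.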

[1]  Michela Taufer,et al.  PseudoBase++: an extension of PseudoBase for easy searching, formatting and visualization of pseudoknots , 2008, Nucleic Acids Res..

[2]  Aravind K. Joshi,et al.  An Earley-Type Parsing Algorithm for Tree Adjoining Grammars , 1988, ACL.

[3]  Aravind K. Joshi,et al.  Tree Adjunct Grammars , 1975, J. Comput. Syst. Sci..

[4]  Sanguthevar Rajasekaran,et al.  Pseudoknot Identification through Learning TAGRNA , 2008, PRIB.

[5]  Kelly P. Williams,et al.  The tmRNA Website: invasion by an intron , 2002, Nucleic Acids Res..

[6]  Don Coppersmith,et al.  Matrix multiplication via arithmetic progressions , 1987, STOC.

[7]  Vipin Kumar,et al.  A parallel parsing algorithm for natural language using tree adjoining grammar , 1994, Proceedings of 8th International Parallel Processing Symposium.

[8]  Karin Harbusch,et al.  An Efficient Parsing Algorithm for Tree Adjoining Grammars , 1990, ACL.

[9]  Aravind K. Joshi,et al.  Some Computational Properties of Tree Adjoining Grammars , 1985, Annual Meeting of the Association for Computational Linguistics.

[10]  Sanguthevar Rajasekaran.  Tree-Adjoining Language Parsing in o(n^6) Time , 1996, SIAM J. Comput..

[11]  Yanga Byun,et al.  PseudoViewer: web application and web service for visualizing RNA pseudoknots and secondary structures , 2006, Nucleic Acids Res..

[12]  Hiroshi Matsui,et al.  Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures , 2004, Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference (CSB 2004).

[13]  Chantal Ehresmann,et al.  In Vitro Evidence for a Long Range Pseudoknot in the 5′-Untranslated and Matrix Coding Regions of HIV-1 Genomic RNA , 2002, The Journal of Biological Chemistry.

[14]  Satoshi Kobayashi,et al.  Tree Adjoining Grammars for RNA Structure Prediction , 1999, Theor. Comput. Sci..

[15]  J. Ng,et al.  PseudoBase: a database with RNA pseudoknots , 2000, Nucleic Acids Res..

[16]  David S. L. Wei,et al.  An Optimal Linear-Time Parallel Parser for Tree Adjoining Languages , 1990, SIAM J. Comput..

[17]  Sean R. Eddy,et al.  Rfam: annotating non-coding RNAs in complete genomes , 2004, Nucleic Acids Res..

[18]  Aravind K. Joshi,et al.  Some Computational Properties of Tree Adjoining Grammars , 1985, ACL.

[19]  Sanguthevar Rajasekaran,et al.  TAL Parsing in O(n^6) Time , 2022 .

[20]  R. C. Underwood,et al.  Stochastic context-free grammars for tRNA modeling. , 1994, Nucleic acids research.

[21]  Sanguthevar Rajasekaran,et al.  TAL Recognition in O(M(n^2)) Time , 1998, J. Comput. Syst. Sci..

[22]  Sanguthevar Rajasekaran,et al.  RNA Pseudoknot Folding through Inference and Identification Using TAGRNA , 2009, BICoB.

[23]  Giorgio Satta,et al.  Tree-Adjoining Grammar Parsing and Boolean Matrix Multiplication , 1994, Comput. Linguistics.