RNA Pseudoknot Folding through Inference and Identification Using TAGRNA

Studying the structure of RNA sequences is an important problem that helps in understanding the functional properties of RNA. Pseudoknots are one type of RNA structure that was long ignored because of the high computational complexity of handling them, but they have received considerable attention recently. Pseudoknot structures are functionally important: they appear, for example, in viral genome RNAs and ribozyme active sites. In this paper, we present TAGRNAInf, a folding framework for RNA structures that supports pseudoknots. Our approach is based on learning TAGRNA grammars from training data annotated with structural information. The inferred grammars are used to identify sequences with structures analogous to those in the training set and to generate a folding for these sequences. We present experimental results and comparisons with other known pseudoknot folding approaches.
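To make concrete what separates a pseudoknot from an ordinary nested structure, the following is a minimal sketch and not part of TAGRNAInf. It assumes a structure is written in extended dot-bracket notation, with `()` and `[]` as two bracket families, and reports a pseudoknot whenever two base pairs (i, j) and (k, l) cross, that is, i < k < j < l. The function names `parse_pairs` and `has_pseudoknot` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def parse_pairs(structure: str) -> List[Tuple[int, int]]:
    """Extract base pairs (i, j), i < j, from dot-bracket notation.

    Assumption: only '.', '()' and '[]' occur; each bracket family is
    matched with its own stack so that the two families may cross.
    """
    closer_to_opener = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closer_to_opener:
            stack = stacks[closer_to_opener[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character '{ch}' at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def has_pseudoknot(pairs: List[Tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross (i < k < j < l)."""
    ordered = sorted(pairs)
    for a, (i, j) in enumerate(ordered):
        # Later pairs start at k > i, so only k < j < l needs checking.
        for k, l in ordered[a + 1:]:
            if k < j < l:
                return True
    return False


nested = parse_pairs("((..))")
knotted = parse_pairs("((..[[..))..]]")
print(has_pseudoknot(nested), has_pseudoknot(knotted))
```

Crossing pairs are what context-free models cannot express, which is why grammar formalisms with more generative power, such as the tree adjoining grammars underlying TAGRNA, are used for pseudoknots.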
