Thermodynamic Heuristics with Case-Based Reasoning: Combined Insights for RNA Pseudoknot Secondary Structure

Abstract The secondary structure of RNA pseudoknots has been studied extensively with computational approaches. Experimental methods for determining RNA structure are time-consuming and tedious, so predictive computational approaches are required. Predicting the most accurate and energetically stable pseudoknotted RNA secondary structure has been proven to be an NP-hard problem. This paper presents a new RNA folding approach, termed MSeeker, which combines KnotSeeker (a heuristic method) with Mfold (a thermodynamic algorithm). The global optimization performed by this thermodynamic heuristic approach is further enhanced by a case-based reasoning technique used as a local optimization method. MSeeker predicts RNA pseudoknot structure from individual sequences, especially long ones. This research demonstrates that MSeeker improves on the sensitivity and specificity of existing RNA pseudoknot structure predictions. Its performance and structural results were evaluated against seven other state-of-the-art pseudoknot prediction methods. MSeeker had better sensitivity than DotKnot, FlexStem, HotKnots, pknotsRG, ILM, NUPACK and pknotsRE, with 79% of the predicted pseudoknot base pairs being correct.
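The sensitivity and specificity figures mentioned in the abstract are base-pair-level measures. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code. It assumes structures are given in dot-bracket notation with `()` for nested pairs and `[]` for pseudoknotted pairs, and it assumes that "specificity" is reported as positive predictive value, as is common in the RNA structure prediction literature. The function names and the example structures are hypothetical.

```python
from typing import Set, Tuple

BasePair = Tuple[int, int]


def parse_dot_bracket(structure: str) -> Set[BasePair]:
    """Return the set of (i, j) base pairs in a dot-bracket string.

    Assumption: '()' marks nested pairs and '[]' marks pseudoknotted pairs.
    An unbalanced closing bracket raises IndexError.
    """
    stacks = {"(": [], "[": []}
    closers = {")": "(", "]": "["}
    pairs: Set[BasePair] = set()
    for i, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(i)           # remember the opening position
        elif ch in closers:
            j = stacks[closers[ch]].pop()  # match the most recent opener
            pairs.add((j, i))
    return pairs


def sensitivity_and_ppv(reference: Set[BasePair],
                        predicted: Set[BasePair]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP)."""
    tp = len(reference & predicted)        # correctly predicted pairs
    sens = tp / len(reference) if reference else 0.0
    ppv = tp / len(predicted) if predicted else 0.0
    return sens, ppv


# Hypothetical reference and predicted structures of equal length.
reference = parse_dot_bracket("((..[[..))..]]")
predicted = parse_dot_bracket("((......))..[]")
sens, ppv = sensitivity_and_ppv(reference, predicted)
```

In this toy example the prediction recovers both nested pairs but neither pseudoknotted pair, so two of the four reference pairs are found and two of the three predicted pairs are correct.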
