Bioinformatics of prokaryotic RNAs

The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly high levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes.
