Rfam 12.0: updates to the RNA families database

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.
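As a rough illustration of applying the Infernal 1.1 based pipeline to one's own sequences, the sketch below assembles a `cmscan` command line and parses its tabular output. It makes several assumptions that are not stated in the text above: that the options `--rfam`, `--cut_ga`, `--nohmmonly` and `--tblout` are used as described in the Infernal user guide, that the output follows the default 18-column `--tblout` layout, and that the file names (`Rfam.cm`, `genome.fa`, `hits.tbl`) and the sample hit line are hypothetical placeholders. It should be read as a minimal sketch, not as the pipeline Rfam runs internally.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    """One row of cmscan --tblout output (subset of columns)."""
    model: str       # name of the matching family model
    accession: str   # model accession, e.g. an RFxxxxx identifier
    sequence: str    # name of the query sequence
    seq_from: int    # start coordinate on the sequence
    seq_to: int      # end coordinate on the sequence
    strand: str      # "+" or "-"
    score: float     # bit score
    evalue: float    # E-value


def build_cmscan_command(cm_db: str, fasta: str, tblout: str) -> List[str]:
    """Assemble (but do not run) a cmscan command line.

    The flags are assumed from the Infernal user guide; all paths are
    supplied by the caller and are placeholders in this sketch.
    """
    return ["cmscan", "--rfam", "--cut_ga", "--nohmmonly",
            "--tblout", tblout, cm_db, fasta]


def parse_tblout(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield hits from tblout lines, skipping comments and blank lines.

    Assumes the default layout: target name, accession, query name,
    accession, mdl, mdl from, mdl to, seq from, seq to, strand, trunc,
    pass, gc, bias, score, E-value, inc, description.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(None, 17)  # the description may contain spaces
        if len(fields) < 17:
            continue  # malformed or truncated row
        yield Hit(
            model=fields[0],
            accession=fields[1],
            sequence=fields[2],
            seq_from=int(fields[7]),
            seq_to=int(fields[8]),
            strand=fields[9],
            score=float(fields[14]),
            evalue=float(fields[15]),
        )


# Hypothetical example row, invented for illustration only.
SAMPLE = [
    "# target name  accession  query name ...",
    "5S_rRNA RF00001 contig1 - cm 1 119 100 218 + no 1 0.55 0.0 85.3 1.2e-18 ! -",
]

if __name__ == "__main__":
    print(" ".join(build_cmscan_command("Rfam.cm", "genome.fa", "hits.tbl")))
    for hit in parse_tblout(SAMPLE):
        print(hit.model, hit.sequence, hit.seq_from, hit.seq_to, hit.score)
```

Keeping command construction separate from parsing means the parser can be exercised on saved output without Infernal installed. In practice the assembled list could be handed to `subprocess.run` once the model database has been prepared for searching.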
