The tedious task of finding homologous noncoding RNA genes.

User-driven in silico RNA homology search is still a nontrivial task. In part, this is a consequence of the limited precision of the computational tools, in spite of recent exciting progress in this area; to a certain extent, computational costs also remain problematic in practice. An important and, as we argue here, dominating issue is the dependence on well-curated (secondary) structural alignments of the RNAs. These are often hard to obtain, not so much because of an inherent limitation in the available data, but because they require substantial manual curation, an effort that is rarely acknowledged. Here, we qualitatively describe a realistic scenario for what a "regular user" (i.e., a nonexpert in a particular RNA family) can do in practice, and what kind of results are likely to be achieved. Despite the indisputable advances in computational RNA biology, the conclusion is discouraging: BLAST still works as well as or better than other methods unless extensive expert knowledge of the RNA family is included. However, when well-curated data are available, recent developments yield further improvements in finding remote homologs. Homology search beyond the reach of BLAST is hence not at all a routine task.
