RNA structure alignment on a massively parallel computer

An efficient implementation of the COVE software suite on the MasPar architecture makes COVE practical for scanning large bioinformatics databases for new genes belonging to known RNA families. Using a covariance model, database sequences are scanned for subsequences that match the consensus RNA secondary structure and sequence of a given family. Massive parallelism is achieved by analysing database sequences in parallel over the rows of the MasPar processor array and mapping the three-dimensional dynamic programming scoring matrix onto each row. We have successfully scanned GenBank and nematode genome sequence for tRNAs and SRP-RNAs. The parallel implementation achieves speedups of two orders of magnitude, bringing database search times down from months to days. We regard this as a useful advance in RNA sequence analysis. We hope to reduce search times further through algorithmic and implementation optimisations, so that COVE can be widely and easily used.
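The decomposition described above, with one database sequence per processor row and a dynamic programming matrix filled independently for each sequence, can be illustrated with a minimal Python sketch. It is not the COVE algorithm: a two-dimensional Nussinov-style base-pair maximisation stands in for the three-dimensional covariance model matrix, and a thread pool stands in for the rows of the MasPar array. The names `max_pairs`, `scan_database` and `n_rows` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Watson-Crick and G-U wobble pairs (assumed scoring: 1 per pair).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Stand-in scoring: maximum number of nested base pairs in seq.

    dp[i][j] holds the best score for the subsequence seq[i..j].
    A covariance model would add a third index for the model state.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            # case: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def scan_database(db, n_rows=4):
    """Score every sequence independently, one sequence per worker 'row'."""
    names = list(db)
    with ThreadPoolExecutor(max_workers=n_rows) as pool:
        scores = list(pool.map(max_pairs, (db[name] for name in names)))
    return dict(zip(names, scores))


if __name__ == "__main__":
    toy_db = {"hairpin": "GGGAAAUCC", "unstructured": "AAAA"}
    print(scan_database(toy_db))
```

Because the sequences are scored independently, the work divides cleanly across rows. In this sketch the thread pool only illustrates that decomposition and is not expected to reproduce the speedups reported above.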
