RNAML: a standard syntax for exchanging RNA information.

Analyzing a single data set using multiple RNA informatics programs often requires a file format conversion between each pair of programs, significantly hampering productivity. To facilitate the interoperation of these programs, we propose a syntax to exchange basic RNA molecular information. This RNAML syntax allows for the storage and the exchange of information about RNA sequence and secondary and tertiary structures. The syntax permits the description of higher-level information about the data including, but not restricted to, base pairs, base triples, and pseudoknots. A class-oriented approach allows us to represent data common to a given set of RNA molecules, such as a sequence alignment and a consensus secondary structure. Documentation about experiments and computations, as well as references to journals and external databases, is included in the syntax. The chief challenge in creating such a syntax was to determine the appropriate scope of usage and to ensure extensibility as new needs arise. The syntax complies with the eXtensible Markup Language (XML) recommendations, a widely accepted standard for syntax specifications. In addition to the various generic packages that exist to read and interpret XML formats, an XML processor was developed and included in the open-source MC-Core library for the computational manipulation of nucleic acid and protein structures.
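
As a rough illustration of what reading such an XML description with a generic package might look like, the sketch below parses a small RNAML-like fragment with Python's standard xml.etree.ElementTree module and converts its base pairs to dot-bracket notation. The element and attribute names (molecule, seq-data, base-pair, five-prime, three-prime) and the helper functions read_molecule and to_dot_bracket are assumptions made for this example only; they have not been checked against the actual RNAML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical RNAML-like fragment. Element and attribute names are
# illustrative assumptions, not taken from the RNAML specification.
DOC = """\
<rnaml version="1.0">
  <molecule id="hairpin">
    <sequence length="12">
      <seq-data>GGGCGAAAGCCC</seq-data>
    </sequence>
    <structure>
      <base-pair five-prime="1" three-prime="12"/>
      <base-pair five-prime="2" three-prime="11"/>
      <base-pair five-prime="3" three-prime="10"/>
      <base-pair five-prime="4" three-prime="9"/>
    </structure>
  </molecule>
</rnaml>
"""


def read_molecule(xml_text):
    """Return (id, sequence, base pairs) for the first molecule element."""
    root = ET.fromstring(xml_text)
    mol = root.find("molecule")
    seq = mol.findtext("sequence/seq-data").strip()
    # Positions are 1-based, as is conventional for residue numbering.
    pairs = [
        (int(bp.get("five-prime")), int(bp.get("three-prime")))
        for bp in mol.iter("base-pair")
    ]
    return mol.get("id"), seq, pairs


def to_dot_bracket(length, pairs):
    """Render nested base pairs as a dot-bracket string."""
    chars = ["."] * length
    for i, j in pairs:
        chars[i - 1] = "("
        chars[j - 1] = ")"
    return "".join(chars)


if __name__ == "__main__":
    mol_id, seq, pairs = read_molecule(DOC)
    print(mol_id, seq, to_dot_bracket(len(seq), pairs))
```

Dot-bracket output covers only nested secondary structure; base triples and pseudoknots, which the syntax is said to describe, would need a richer target representation than this sketch provides.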