RightField: Semantic enrichment of Systems Biology data using spreadsheets

The interpretation and integration of experimental data depend on consistent metadata and uniform annotation. However, there are many barriers to acquiring this rich semantic metadata, not least the overhead and complexity of its collection by scientists. We present RightField, a lightweight spreadsheet-based annotation tool that lowers the barrier to manual metadata acquisition, together with a data integration application for extracting and querying RDF data from the enriched spreadsheets. By hiding the complexities of semantic annotation, we can improve the collection of rich metadata, at source, by scientists. We illustrate the approach with results from the SysMO program, showing that RightField supports the whole workflow of semantic data collection, submission and RDF querying in Systems Biology. The RightField tool is freely available from http://www.rightfield.org.uk, and the code is open source under the BSD License.
