Translating standards into practice - One Semantic Web API for Gene Expression

Sharing and describing experimental results unambiguously, with sufficient detail to enable replication, is a fundamental tenet of scientific research. In today's cluttered world of "-omics" sciences, data standards and the standardized use of terminologies and ontologies for biomedical informatics play an important role in reporting high-throughput experiment results in formats that can be interpreted by both researchers and analytical tools. Increasing adoption of Semantic Web and Linked Data technologies for the integration of heterogeneous and distributed health care and life sciences (HCLS) datasets has made the reuse of standards even more pressing: dynamic semantic query federation can be used for integrative bioinformatics only when ontologies and identifiers are reused across data instances. We present here a methodology to integrate the results and experimental context of three different representations of microarray-based transcriptomic experiments: the Gene Expression Atlas, the W3C BioRDF task force approach to reporting Provenance of Microarray Experiments, and the HSCI blood genomics project. Our approach does not attempt to improve the expressivity of existing standards for genomics; instead, it aims to enable integration of existing datasets published from microarray-based transcriptomic experiments. SPARQL CONSTRUCT queries are used to create a posteriori mappings of concepts and properties, together with linking rules that match entities based on query constraints. We discuss how our integrative approach can encourage reuse of the Experimental Factor Ontology (EFO) and the Ontology for Biomedical Investigations (OBI) for the reporting of experimental context and results of gene expression studies.
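
As a rough illustration of what an a posteriori mapping and a linking rule amount to, the sketch below imitates the effect of a CONSTRUCT-style query in plain Python over triples held as tuples. It is not the authors' implementation and does not run SPARQL; every IRI, predicate name, and function name in it (`example.org`, `geneId`, `sameEntityAs`, `construct_mapped`, `link_entities`) is a hypothetical placeholder chosen for the example.

```python
# All IRIs below are hypothetical placeholders, not the vocabularies used in the paper.
SRC_A = "http://example.org/atlas/"
SRC_B = "http://example.org/biordf/"
COMMON = "http://example.org/common/"

# A posteriori property mappings: source predicate -> shared predicate.
PROPERTY_MAP = {
    SRC_A + "dbXref": COMMON + "geneId",
    SRC_B + "geneIdentifier": COMMON + "geneId",
    SRC_A + "hasFactorValue": COMMON + "experimentalFactor",
    SRC_B + "factor": COMMON + "experimentalFactor",
}


def construct_mapped(triples):
    """Emit a new triple for each input triple whose predicate has a mapping.

    Roughly the effect of: CONSTRUCT { ?s common:p ?o } WHERE { ?s src:p ?o }
    """
    return {(s, PROPERTY_MAP[p], o) for (s, p, o) in triples if p in PROPERTY_MAP}


def link_entities(triples, key_predicate):
    """Linking rule: distinct subjects sharing a value for key_predicate are matched."""
    subjects_by_value = {}
    for s, p, o in triples:
        if p == key_predicate:
            subjects_by_value.setdefault(o, set()).add(s)
    links = set()
    for subjects in subjects_by_value.values():
        ordered = sorted(subjects)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                links.add((first, COMMON + "sameEntityAs", second))
    return links


# Toy input: two sources describing the same gene with different predicates.
data = [
    (SRC_A + "gene/1", SRC_A + "dbXref", "ENSG00000141510"),
    (SRC_B + "probe/7", SRC_B + "geneIdentifier", "ENSG00000141510"),
    (SRC_A + "gene/1", SRC_A + "hasFactorValue", "leukemia"),
    (SRC_B + "probe/7", SRC_B + "label", "unmapped"),  # no mapping, so dropped
]

mapped = construct_mapped(data)
links = link_entities(mapped, COMMON + "geneId")
```

The source data is left untouched: the mapped triples and links are derived views, which is the sense in which such mappings are a posteriori. In a real deployment the same logic would presumably be expressed as CONSTRUCT queries evaluated by a SPARQL engine.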
