Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration

Background: Scientific data integration and computational service discovery are challenges for the bioinformatics community. These challenges are compounded by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap"), offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services.

Methods: We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are suitable for semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info.

Results: We created ten services delivering QTL information from Gramene. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services: two allow the retrieval of DNA and RNA FASTA sequences, and the third provides nucleic acid sequence comparison capability (BLAST).

Conclusions: The need for semantic integration technologies has preceded available solutions. We report the feasibility of mapping high-priority data from local, independent, idiosyncratic data schemas to common shared concepts as implemented in web-accessible ontologies. These mappings are then amenable to use in semantic web services. Our implementation of approximately two dozen services means that biological data at three large information resources (Gramene, SoyBase, and LIS) is available for programmatic access, semantic searching, and enhanced interaction between the separate missions of these resources.
