An Intelligent SPARQL Query Builder for Exploration of Various Life-science Databases

Integrating the wide variety of life-science data held in separate databases is an important prerequisite for comprehensive data analysis. Because Semantic Web technologies such as the Resource Description Framework (RDF) are expected to make data integration more efficient, many life-science databases are now published in RDF, with SPARQL Protocol and RDF Query Language (SPARQL) endpoints serving as search application programming interfaces on the web. Although SPARQL offers useful functions for exploring and integrating diverse datasets, many biologists find it difficult to use. To overcome this problem, we propose an intelligent SPARQL query builder that helps users with no knowledge of SPARQL build queries. This paper describes the methods used by the tool and its system design and implementation. The tool assists users in generating queries for cross-database annotation based on RDF, and it enhances the value of life-science data by helping users explore and integrate them.
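
To make the idea of assisted query building concrete, the following is a minimal sketch of how a tool might assemble a SPARQL query from a class and a set of properties chosen by the user. It is not the method described in this paper: the function name `build_select_query`, the variable naming scheme, and the `http://example.org/` IRIs are all hypothetical and used only for illustration.

```python
def build_select_query(class_iri, property_iris, limit=10):
    """Assemble a simple SPARQL SELECT query as a string.

    Illustrative only: a hypothetical helper, not the paper's algorithm.
    """
    if not property_iris:
        raise ValueError("at least one property IRI is required")

    # One variable for the matched resource, one per selected property.
    variables = ["?item"] + [f"?value{i}" for i in range(len(property_iris))]

    # Restrict ?item to the chosen class, then fetch each property value.
    patterns = [f"?item a <{class_iri}> ."]
    for i, prop in enumerate(property_iris):
        patterns.append(f"?item <{prop}> ?value{i} .")

    body = "\n  ".join(patterns)
    return (
        f"SELECT {' '.join(variables)}\n"
        f"WHERE {{\n  {body}\n}}\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical class and property IRIs, standing in for a user's selections.
query = build_select_query(
    "http://example.org/Protein",
    ["http://example.org/name", "http://example.org/organism"],
)
print(query)
```

In this sketch the user only picks a class and properties; the triple patterns and variable names are generated for them, which is the kind of burden a query builder is meant to remove.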
