StdTrip: An a Priori Design Approach and Process for Publishing Open Government Data

Open Government Data (OGD) consists in the publication of information produced, archived and distributed by public organizations in formats that allow it to be shared, discovered, accessed and easily manipulated by third-party consumers. This approach requires the triplification of datasets, i.e., the conversion of database schemas and their instances to a set of RDF triples. A key issue in this process is deciding how to represent database schema concepts in terms of RDF classes and properties. This is done by mapping database concepts to an RDF vocabulary, which is then used as the basis for generating the triples. The construction of this vocabulary is extremely important: the more standard vocabularies are reused, the easier it is to interlink the result with existing datasets. However, the tools available today do not support the reuse of standard vocabularies in the triplification process; instead, they create new vocabularies. In this paper, we present StdTrip, a process that guides users through triplification. It first promotes the reuse of standard, W3C-recommended RDF vocabularies and, where that is not possible, suggests the reuse of vocabularies already employed by other RDF datasets on the Web.
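To make the idea of triplification with vocabulary reuse concrete, the following is a minimal sketch and not the StdTrip implementation. It assumes a hypothetical `person` table, hypothetical `example.org` namespaces, and a hand-written mapping to FOAF terms. Columns with an entry in the mapping reuse a standard term; all others fall back to a locally minted property, which is the situation the process aims to minimize.

```python
# Illustrative sketch only: table, columns, and example.org IRIs are assumptions.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LOCAL = "http://example.org/vocab/"     # hypothetical fallback vocabulary
BASE = "http://example.org/resource/"   # hypothetical base for instance IRIs

# Hand-written mapping from schema concepts to standard vocabulary terms.
CLASS_MAP = {"person": FOAF + "Person"}
PROPERTY_MAP = {("person", "name"): FOAF + "name"}


def triplify_row(table, key, row):
    """Convert one database row into (subject, predicate, object) triples."""
    subject = f"{BASE}{table}/{row[key]}"
    # Reuse a standard class when one is mapped, else mint a local class.
    rdf_class = CLASS_MAP.get(table, LOCAL + table.capitalize())
    triples = [(subject, RDF_TYPE, rdf_class)]
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is encoded in the subject IRI; skip NULLs
        # Reuse a standard property when one is mapped, else mint a local one.
        predicate = PROPERTY_MAP.get((table, column), f"{LOCAL}{table}_{column}")
        triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    for triple in triplify_row("person", "id", {"id": 7, "name": "Ana", "badge_no": "X1"}):
        print(triple)
```

In this sketch the `name` column is published as `foaf:name`, so other FOAF-based datasets can be interlinked with it directly, while `badge_no` ends up under the local namespace and would need a later alignment step.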
