Piazza: mediation and integration infrastructure for Semantic Web data

The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new languages for specifying meanings for concepts and developed techniques for reasoning about them, using RDF as the data model. To flourish, the Semantic Web needs to provide interoperability both between sites with different terminologies and with existing data and the applications operating on them. To achieve this, we are faced with two problems. First, most of the world's data is available not in RDF but in XML; XML and the applications consuming it rely not only on the domain structure of the data, but also on its document structure. Hence, to provide interoperability between such sources, we must map between both their domain structures and their document structures. Second, data management practitioners often prefer to exchange data through local point-to-point data translations, rather than mapping to common mediated schemas or ontologies. This paper describes the Piazza system, which addresses these challenges. Piazza offers a language for mediating between data sources on the Semantic Web, and it maps both the domain structure and the document structure. Piazza also enables interoperation of XML data with RDF data that is accompanied by rich OWL ontologies. Mappings in Piazza are provided at a local scale between small sets of nodes, and our query answering algorithm is able to chain sets of mappings together to obtain relevant data from across the Piazza network. We also describe an implemented scenario in Piazza and the lessons we learned from it.
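The idea of chaining local mappings can be illustrated with a toy sketch. This is not Piazza's mapping language or its query answering algorithm; it only assumes that each pairwise mapping is a simple term renaming between two peers, and the peer names, terms, and the `reformulate` helper are all hypothetical. Starting from a term at one peer, it follows mappings transitively to find the corresponding term at every reachable peer.

```python
from collections import deque

# Hypothetical pairwise mappings between peers:
# (source peer, target peer) -> {term at source: term at target}
MAPPINGS = {
    ("A", "B"): {"author": "writer", "title": "name"},
    ("B", "C"): {"writer": "creator"},
}


def reformulate(peer, term, mappings):
    """Return {peer: term} for every peer reachable by chaining mappings.

    Breadth-first traversal over the mapping graph: a peer is reached only
    if some mapping translates the term as known at an already-reached peer.
    """
    reached = {peer: term}
    queue = deque([peer])
    while queue:
        current = queue.popleft()
        for (src, dst), renames in mappings.items():
            if src != current or dst in reached:
                continue
            current_term = reached[current]
            if current_term in renames:
                reached[dst] = renames[current_term]
                queue.append(dst)
    return reached


if __name__ == "__main__":
    # "author" at A maps to "writer" at B, which maps to "creator" at C.
    print(reformulate("A", "author", MAPPINGS))
    # "title" at A maps to "name" at B, but no B-to-C mapping covers "name".
    print(reformulate("A", "title", MAPPINGS))
```

In this sketch no peer needs a mapping to every other peer: A reaches C only through B, which mirrors the local, point-to-point style of mapping described above.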
