The Piazza peer data management project

A major problem in today's information-driven world is that sharing heterogeneous, semantically rich data is incredibly difficult. Piazza is a peer data management system that enables sharing heterogeneous data in a distributed and scalable way. Piazza assumes that participants are interested in sharing data and willing to define pairwise mappings between their schemas. Users then formulate queries over their preferred schema, and a query answering system recursively expands any mappings relevant to the query, retrieving data from other peers. In this paper, we provide a brief overview of the Piazza project, including our work on developing mapping languages and query reformulation algorithms, assisting users in defining mappings, indexing, and enforcing access control over shared data.
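To make the idea of recursively expanding pairwise mappings concrete, the following is a minimal sketch and not Piazza's actual mapping language or reformulation algorithm. It assumes that every mapping is a plain attribute renaming between two peers and that a query is just a list of attributes over one peer's schema. The names `Peer`, `Mapping`, and `answer` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    rows: list  # list of dicts keyed by this peer's own attribute names


@dataclass(frozen=True)
class Mapping:
    source: str    # peer whose schema the query is posed over
    target: str    # neighbouring peer that can supply data
    rename: tuple  # pairs (source_attr, target_attr); assumed pure renaming


def answer(peer_name, attrs, peers, mappings, visited=None):
    """Return rows over `attrs` in peer_name's schema, following mappings transitively."""
    # Track the peers on the current path so cyclic mappings terminate.
    visited = (visited or frozenset()) | {peer_name}

    # Local answers: project the peer's own rows onto the requested attributes.
    results = [
        {a: row[a] for a in attrs}
        for row in peers[peer_name].rows
        if all(a in row for a in attrs)
    ]

    for m in mappings:
        if m.source != peer_name or m.target in visited:
            continue
        rename = dict(m.rename)
        if not all(a in rename for a in attrs):
            continue  # this mapping is not relevant to the query
        # Reformulate the query over the neighbour's schema and recurse.
        target_attrs = [rename[a] for a in attrs]
        for row in answer(m.target, target_attrs, peers, mappings, visited):
            # Translate each retrieved row back into the querying peer's schema.
            results.append({a: row[rename[a]] for a in attrs})
    return results


# Hypothetical three-peer network: A maps to B, B maps to C, C maps back to A.
peers = {
    "A": Peer("A", [{"title": "a1"}]),
    "B": Peer("B", [{"name": "b1"}]),
    "C": Peer("C", [{"label": "c1"}]),
}
mappings = [
    Mapping("A", "B", (("title", "name"),)),
    Mapping("B", "C", (("name", "label"),)),
    Mapping("C", "A", (("label", "title"),)),
]

titles = answer("A", ["title"], peers, mappings)
```

A query posed over peer A's schema reaches peer C even though A and C share no direct mapping, which is the transitive behaviour the overview describes. The sketch does not remove duplicates when a peer is reachable along several paths, and it ignores selections, joins, and the richer mapping forms a real system would need.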
