Privacy in GLAV Information Integration

We define and study formal privacy guarantees for information integration systems in which sources are related to a public schema by mappings given as source-to-target dependencies that express inclusions of unions of conjunctive queries with equality. This generalizes previous privacy work in the global-as-view publishing scenario and covers local-as-view mappings as well as combinations of the two. We concentrate on logical security, where malicious users have the same level of access as legitimate users: they can issue queries against the global schema, which are answered under “certain answers” semantics, and then apply unlimited computational power and external knowledge to the query results in order to guess the result of a secret query (“the secret”) on one or more of the sources, which are not directly accessible. We do not address physical security, which includes preventing users from gaining unauthorized access to the data. We define both absolute guarantees (how safe is the secret?) and relative guarantees (how much more of the secret is disclosed when the mapping is extended, for example to allow new data sources or new relationships between an existing data source and the global schema?). We provide algorithms for checking whether these guarantees hold, and undecidability results for related, stronger guarantees.
