Efficient secure query evaluation over encrypted XML databases

Motivated by the "database-as-service" paradigm, in which data owned by a client is hosted on a third-party server, there is significant interest in secure query evaluation over encrypted databases. We study this problem for XML databases. Our attack model assumes that the attacker may possess exact knowledge of the domain values and their occurrence frequencies, and we aim to protect sensitive structural information as well as value associations. We capture these security requirements using a novel notion of security constraints. For security reasons, the sensitive parts of the hosted database are encrypted. There is a tension between data security and the efficiency of query evaluation across different granularities of encryption, and we show that finding an optimal, secure encryption scheme is NP-hard. To speed up query processing, we propose keeping metadata, consisting of structure and value indices, on the server. The server, or an attacker who gains access to it, must not learn sensitive information in the database. We define security properties that such a hosted XML database system should satisfy and prove that our proposal satisfies them. Intuitively, by examining the encrypted database together with the metadata, the attacker cannot improve on his prior belief probability distribution over which candidate database led to the given encrypted database. We also prove that, by observing a series of queries and their answers, the attacker cannot improve on his prior belief probability distribution over which sensitive queries (structural or value associations) hold in the hosted database. Finally, a detailed set of experiments demonstrates that our techniques enable efficient query processing while satisfying the security properties defined in the paper.
