Secrecy and performance models for query processing on outsourced graph data

Outsourcing a database poses a challenge for data secrecy: even if an adversary, possibly the service provider itself, gains access to the data, she should not be able to learn any information from it. In this paper we address this problem for graph-structured data. First, we define a secrecy notion for graph-structured data based on the concept of indistinguishability. The notion ensures that an adversary can learn which edges exist between nodes only with negligible probability. To achieve this, we propose an approach based on bucketization that additionally makes use of obfuscated indexes and encryption. We show that finding an optimal bucketization tailored to graph-structured data is NP-hard and therefore propose a heuristic. We prove that the proposed bucketization approach fulfills our secrecy notion. In addition, we present a performance model consisting of (1) a number-of-buckets model, which estimates the number of buckets obtained after applying our bucketization approach, and (2) a query-cost model. Finally, we demonstrate in a set of experiments (1) the accuracy of our number-of-buckets model for scale-free networks and (2) the efficiency of our approach with respect to query processing.