Top-k Queries across Multiple Private Databases

Advances in distributed service-oriented computing and global communications have created a strong technology push for large-scale data integration among organizations and enterprises. However, concerns about data privacy are becoming increasingly important for large-scale, mission-critical data integration applications. Ideally, given a database query spanning multiple private databases, the goal is to compute the answer to the query without revealing any information about an individual database beyond the query result. In practice, this constraint can be relaxed to allow efficient information integration while minimizing information disclosure. In this paper, the authors propose an efficient decentralized peer-to-peer protocol for supporting aggregate queries over multiple private databases while respecting the privacy constraints of participants. The paper makes three main contributions. First, it formalizes the notion of loss of privacy in terms of the information revealed at individual participating databases. Second, it presents a novel probabilistic decentralized protocol for top-k selection across multiple private databases that minimizes the loss of privacy. Third, it experimentally evaluates the protocol in terms of its correctness, efficiency, and privacy characteristics.
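
The abstract does not spell out the protocol's mechanics, so the following is only an illustrative sketch of how a probabilistic decentralized selection could work for the simplest case, k = 1 (the maximum). It assumes, hypothetically, that participants are arranged in a ring, that a running value is passed from node to node, and that a node holding a larger value sometimes forwards a random smaller value instead of its true one. The function name `ring_max`, the parameters `p0`, `decay`, and `floor`, and the decaying randomization schedule are all assumptions made for illustration, not details taken from the text above.

```python
import random


def ring_max(private_values, rounds=5, p0=1.0, decay=0.5, floor=0, rng=None):
    """Illustrative ring-based randomized max selection (k = 1).

    Assumptions (hypothetical, for illustration only):
    - each entry of private_values is an integer >= floor held by one node;
    - nodes pass a running value g around a ring for a fixed number of rounds;
    - in round r a node masks its value with probability p0 * decay**r,
      and the final round uses probability 0 so the result is exact.
    """
    rng = rng or random.Random()
    g = floor  # publicly known lower bound used as the starting value
    for r in range(rounds):
        # Randomization probability shrinks each round; none in the last round.
        p = 0.0 if r == rounds - 1 else p0 * decay ** r
        for v in private_values:
            if v <= g:
                continue  # node forwards g unchanged, revealing only v <= g
            if rng.random() < p:
                # Mask the true value with a random value in [g, v - 1].
                g = rng.randint(g, v - 1)
            else:
                g = v  # node contributes its true value
    return g


result = ring_max([12, 40, 7, 33], rng=random.Random(0))
```

In this sketch the running value never exceeds the true maximum, and the unrandomized final round forces it up to the true maximum, so the answer is exact while intermediate messages do not necessarily expose any node's actual value. Extending the idea to general k would require passing a vector of k values rather than a single one.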
