Privacy-preserving LOF outlier detection

LOF is a well-known approach for density-based outlier detection and has received much attention recently. It is important to design a privacy-preserving LOF outlier detection algorithm as the data on which LOF runs is typically spilt among multiple participants and no one is willing to disclose his sensitive information due to legal or moral considerations. This is, however, a hard problem since participants need to find the maximum one of the distances between an object and its k-Nearest Neighbors (k-NN) without learning the information of these objects. In this paper, we propose an efficient protocol for privacy-preserving LOF outlier detection. We first employ a shuffle protocol to permute the distance vectors owned by different participants. Then, we design a secure selection method to obtain the garbled k-NN indexes and shares of k-distance for given objects. For each object, we make use of the k-distance of all objects to construct a vector, based on which the permute protocol is executed again to obtain new shares of k-distance. Finally, the shares corresponding to the garbled k-NN indexes are selected as the expected result. Our protocol ensures that all the intermediates are shared between multiple participants and thus avoid information leaking. In addition, our protocol is efficient as we prove that the computation and communication complexity of our protocol is bounded by $$O(n^2)$$O(n2).

[1]  Rebecca N. Wright,et al.  Privacy-preserving distributed k-means clustering over arbitrarily partitioned data , 2005, KDD '05.

[2]  Ran Canetti,et al.  Universally composable security: a new paradigm for cryptographic protocols , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[3]  Ahmad-Reza Sadeghi,et al.  Improved Garbled Circuit Building Blocks and Applications to Auctions and Computing Minima , 2009, IACR Cryptol. ePrint Arch..

[4]  Jan Willemson,et al.  Round-Efficient Oblivious Database Manipulation , 2011, ISC.

[5]  Chris Clifton,et al.  Privacy-preserving distributed mining of association rules on horizontally partitioned data , 2004, IEEE Transactions on Knowledge and Data Engineering.

[6]  Oded Goldreich,et al.  Foundations of Cryptography: Volume 2, Basic Applications , 2004 .

[7]  Herbert Burkert,et al.  Some Preliminary Comments on the DIRECTIVE 95/46/EC OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. , 1996 .

[8]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[9]  Mikhail J. Atallah,et al.  Efficient Privacy-Preserving k-Nearest Neighbor Search , 2008, 2008 The 28th International Conference on Distributed Computing Systems.

[10]  Artak Amirbekyan,et al.  Practical protocol for Yao’s millionaires problem enables secure multi-party computation of metrics and efficient privacy-preserving k-NN for large data sets , 2009, Knowledge and Information Systems.

[11]  Bart Goethals,et al.  On Private Scalar Product Computation for Privacy-Preserving Data Mining , 2004, ICISC.

[12]  Benny Pinkas,et al.  Fairplay - Secure Two-Party Computation System , 2004, USENIX Security Symposium.

[13]  Dan Bogdanov,et al.  High-performance secure multi-party computation for data mining applications , 2012, International Journal of Information Security.

[14]  Jonathan Katz,et al.  Faster Secure Two-Party Computation Using Garbled Circuits , 2011, USENIX Security Symposium.

[15]  Joydeep Ghosh,et al.  Privacy-preserving distributed clustering using generative models , 2003, Third IEEE International Conference on Data Mining.

[16]  Donald Beaver,et al.  Secure multiparty protocols and zero-knowledge proof systems tolerating a faulty minority , 2004, Journal of Cryptology.

[17]  Hans-Peter Kriegel,et al.  LOF: identifying density-based local outliers , 2000, SIGMOD '00.

[18]  Mihir Bellare,et al.  Foundations of garbled circuits , 2012, CCS.

[19]  Chris Clifton,et al.  Tools for privacy preserving distributed data mining , 2002, SKDD.

[20]  Yehuda Lindell,et al.  A Proof of Yao's Protocol for Secure Two-Party Computation , 2004, Electron. Colloquium Comput. Complex..

[21]  Ramakrishnan Srikant,et al.  Privacy-preserving data mining , 2000, SIGMOD '00.

[22]  Paul F. Syverson,et al.  Onion routing , 1999, CACM.

[23]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[24]  Raymond T. Ng,et al.  Algorithms for Mining Distance-Based Outliers in Large Datasets , 1998, VLDB.

[25]  Benny Pinkas,et al.  Fairplay - Secure Two-Party Computation System (Awarded Best Student Paper!) , 2004 .

[26]  Chris Clifton,et al.  Privacy-preserving k-means clustering over vertically partitioned data , 2003, KDD '03.

[27]  Rajeev Rastogi,et al.  Efficient algorithms for mining outliers from large data sets , 2000, SIGMOD 2000.

[28]  Wenliang Du,et al.  Privacy-preserving cooperative statistical analysis , 2001, Seventeenth Annual Computer Security Applications Conference.

[29]  Chris Clifton,et al.  Privacy-Preserving Kth Element Score over Vertically Partitioned Data , 2009, IEEE Transactions on Knowledge and Data Engineering.

[30]  Yehuda Lindell,et al.  Implementing Two-Party Computation Efficiently with Security Against Malicious Adversaries , 2008, SCN.

[31]  Abhi Shelat,et al.  Efficient Secure Computation with Garbled Circuits , 2011, ICISS.

[32]  Wei Zhao,et al.  A new scheme on privacy-preserving data classification , 2005, KDD '05.

[33]  Benny Pinkas,et al.  FairplayMP: a system for secure multi-party computation , 2008, CCS.

[34]  Douglas Wikström,et al.  A Universally Composable Mix-Net , 2004, TCC.

[35]  Nicholas Hopper,et al.  Scalable onion routing with torsk , 2009, CCS.

[36]  Gu Si-yang,et al.  Privacy preserving association rule mining in vertically partitioned data , 2006 .

[37]  Chris Clifton,et al.  Privacy-preserving outlier detection , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[38]  Sridhar Ramaswamy,et al.  Efficient algorithms for mining outliers from large data sets , 2000, SIGMOD '00.

[39]  Oded Goldreich Foundations of Cryptography: Index , 2001 .

[40]  Ahmad-Reza Sadeghi,et al.  TASTY: tool for automating secure two-party computations , 2010, CCS '10.

[41]  Abhi Shelat,et al.  Billion-Gate Secure Computation with Malicious Adversaries , 2012, USENIX Security Symposium.

[42]  Chris Clifton,et al.  Privacy Preserving Naïve Bayes Classifier for Vertically Partitioned Data , 2004, SDM.

[43]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[44]  Dan Bogdanov,et al.  Sharemind: A Framework for Fast Privacy-Preserving Computations , 2008, ESORICS.

[45]  Benny Pinkas,et al.  Secure Two-Party Computation is Practical , 2009, IACR Cryptol. ePrint Arch..