Secure k Nearest Neighbors Query for High-Dimensional Vectors in Outsourced Environments

As data grows rapidly in both dimensionality and volume, $k$ nearest neighbors search in cloud environments has attracted increasing attention from researchers in databases and cloud computing. The key challenge in moving $k$ nearest neighbors search from a local server (the traditional setting) to a third-party cloud is that the database, which often contains sensitive information, must be kept secret from the cloud. In this work, we present a pair of solutions for the Secure $k$ Nearest Neighbors (S$k$NN) query in outsourced environments. By combining coarse quantization with two cryptographic techniques, the Advanced Encryption Standard (AES) and Paillier homomorphic encryption, we construct a secure Inverted File (IVF) and compute encrypted approximate distances directly, so that high-dimensional data can be searched at the third-party cloud provider with a better trade-off between search quality and security. Empirical studies on real datasets in practical environments validate the feasibility, completeness, and practicality of our solutions. Compared with the state of the art, the proposed solutions take a novel approach to S$k$NN over high-dimensional data, achieve very short response times, and provide strong privacy protection for both the user and the cloud provider.
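To make the phrase "compute encrypted approximate distances directly" concrete, the following is a minimal sketch of how the additive homomorphism of Paillier encryption lets one party compute an encrypted squared Euclidean distance without seeing the query. It is not the protocol of this paper: the toy key size, the function names (`keygen`, `encrypt`, `decrypt`, `encrypted_sq_distance`), and the assumption that the centroid is available in plaintext are all illustrative choices made here. The sketch relies on the expansion $\|q - c\|^2 = \sum_i q_i^2 - 2\sum_i q_i c_i + \sum_i c_i^2$.

```python
import math
import secrets

# Toy Mersenne primes, chosen only so the example runs quickly.
# They are far too small for any real security.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu)) using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Paillier encryption: c = (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Paillier decryption: m = L(c^lam mod n^2) * mu mod n, L(x) = (x-1)/n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add(n, c1, c2):
    """Enc(a) * Enc(b) mod n^2 decrypts to a + b mod n."""
    return (c1 * c2) % (n * n)


def mul_const(n, c, k):
    """Enc(a)^k mod n^2 decrypts to k * a mod n (k may be negative)."""
    return pow(c, k % n, n * n)


def encrypted_sq_distance(n, enc_q, enc_q_norm, centroid):
    """Combine Enc(q_i) and Enc(sum q_i^2) with a plaintext centroid
    (an assumption of this sketch) into Enc(||q - c||^2)."""
    acc = enc_q_norm
    for cq, ci in zip(enc_q, centroid):
        acc = add(n, acc, mul_const(n, cq, -2 * ci))
    c_norm = sum(ci * ci for ci in centroid)
    return add(n, acc, encrypt(n, c_norm))


n, priv = keygen()
query = [3, 7, 1, 9]
centroid = [4, 2, 6, 5]

# The key holder encrypts the query coordinates and its squared norm.
enc_q = [encrypt(n, qi) for qi in query]
enc_q_norm = encrypt(n, sum(qi * qi for qi in query))

# The other party works on ciphertexts only.
enc_d = encrypted_sq_distance(n, enc_q, enc_q_norm, centroid)

# Only the key holder can read the result.
distance = decrypt(n, priv, enc_d)
```

Only the key holder can decrypt `enc_d`; the party that evaluates `encrypted_sq_distance` sees ciphertexts and its own centroid. A real deployment would need a modulus of at least 2048 bits and a vetted cryptographic library rather than this hand-rolled arithmetic.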
