Enabling secure and effective near-duplicate detection over encrypted in-network storage

Near-duplicate detection (NDD) plays an essential role for effective resource utilization and possible traffic alleviation in many emerging network architectures, leveraging in-network storage for various content-centric services. As innetwork storage grows, data security has become one major concern. Though encryption is viable for in-network data protection, current techniques are still lacking for effectively locating encrypted near-duplicate data, making the benefits of NDD practically invalidated. Besides, adopting encrypted innetwork storage further complicates the user authorization when locating near-duplicate data from multiple content providers under different keys. In this paper, we propose a secure and effective NDD system over encrypted in-network storage supporting multiple content providers. Our design bridges locality-sensitive hashing (LSH) with a newly developed cryptographic primitive, multi-key searchable encryption, which allows the user to send only one encrypted query to access near-duplicate data encrypted under different keys. It relieves the users from multiple rounds of interactions or sending multiple different queries respectively. As simply applying LSH does not ensure the detection quality, we then leverage Yao's garbled circuits to build a secure protocol to obtain highly accurate results, without user-side post-processing. We formally analyze the security strength. Experiments demonstrate our system achieves practical performance with comparable accuracy to plaintext.

[1]  Kartik Nayak,et al.  ObliVM: A Programming Framework for Secure Computation , 2015, 2015 IEEE Symposium on Security and Privacy.

[2]  Yiwei Thomas Hou,et al.  Protecting your right: Attribute-based keyword search with fine-grained owner-enforced search authorization in the cloud , 2014, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.

[3]  Patrick Crowley,et al.  Named data networking , 2014, CCRV.

[4]  Naranker Dulay,et al.  Shared and Searchable Encrypted Data for Untrusted Servers , 2008 .

[5]  Chong-Wah Ngo,et al.  Practical elimination of near-duplicates from web video search , 2007, ACM Multimedia.

[6]  Akamai , 2018, Math Horizons.

[7]  Mihir Bellare,et al.  DupLESS: Server-Aided Encryption for Deduplicated Storage , 2013, USENIX Security Symposium.

[8]  Hugo Krawczyk,et al.  Outsourced symmetric private information retrieval , 2013, IACR Cryptol. ePrint Arch..

[9]  Cong Wang,et al.  Enabling Privacy-Preserving Image-Centric Social Discovery , 2014, 2014 IEEE 34th International Conference on Distributed Computing Systems.

[10]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[11]  Cong Wang,et al.  Harnessing encrypted data in cloud for secure and efficient image sharing from mobile devices , 2015, 2015 IEEE Conference on Computer Communications (INFOCOM).

[12]  Bart Thomee,et al.  Large scale image copy detection evaluation , 2008, MIR '08.

[13]  Rafail Ostrovsky,et al.  Searchable symmetric encryption: Improved definitions and efficient constructions , 2011, J. Comput. Secur..

[14]  Rafail Ostrovsky,et al.  Public Key Encryption with Keyword Search , 2004, EUROCRYPT.

[15]  Xue Liu,et al.  Smart in-network deduplication for storage-aware SDN , 2013, SIGCOMM.

[16]  Cong Wang,et al.  Enabling Encrypted Cloud Media Center with Secure Deduplication , 2015, AsiaCCS.

[17]  Ming Li,et al.  Authorized Private Keyword Search over Encrypted Data in Cloud Computing , 2011, 2011 31st International Conference on Distributed Computing Systems.

[18]  Nickolai Zeldovich,et al.  Multi-Key Searchable Encryption , 2013, IACR Cryptol. ePrint Arch..

[19]  Murat Kantarcioglu,et al.  Efficient Similarity Search over Encrypted Data , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[20]  Yan Ke,et al.  Efficient Near-duplicate Detection and Sub-image Retrieval , 2004 .

[21]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[22]  Hari Balakrishnan,et al.  Building Web Applications on Top of Encrypted Data Using Mylar , 2014, NSDI.

[23]  Xue Liu,et al.  SmartEye: Real-time and efficient cloud image sharing for disaster environments , 2015, 2015 IEEE Conference on Computer Communications (INFOCOM).

[24]  Robert H. Deng,et al.  Private Query on Encrypted Data in Multi-user Settings , 2008, ISPEC.

[25]  Stratis Ioannidis,et al.  Privacy-Preserving Ridge Regression on Hundreds of Millions of Records , 2013, 2013 IEEE Symposium on Security and Privacy.