RARE: Defeating side channels based on data-deduplication in cloud storage

Client-side data deduplication enables cloud storage services (e.g., Dropbox) to achieve both storage and bandwidth savings, resulting in reduced operating cost and high level of user satisfaction. However, the deduplication checks (i.e., the corresponding essential message exchange) create a side channel, exposing the privacy of file existence status to the attacker. In particular, the binary response from the deduplication check reveals the information about the existence of a copy of the file in the cloud storage. This behavior can be exploited to launch further attacks such as learning the sensitive file content and establishing a covert channel. While current solutions provide only weaker privacy or rely on unreasonable assumptions, we propose RAndom REsponse (RARE) approach to achieve stronger privacy. The idea behind our proposed RARE solution is that the uploading user sends the deduplication request for two chunks at once. The cloud receiving the deduplication request returns the randomized deduplication response with the careful design so as to preserve the deduplication gain and at the same time minimize the privacy leakage. Our analytical results confirm privacy guarantee and results show that both deduplication benefit and privacy of RARE can be preserved.

[1]  Hong Jiang,et al.  FastCDC: a Fast and Efficient Content-Defined Chunking Approach for Data Deduplication , 2016, USENIX ATC.

[2]  Mihir Bellare,et al.  Message-Locked Encryption and Secure Deduplication , 2013, EUROCRYPT.

[3]  Mihir Bellare,et al.  DupLESS: Server-Aided Encryption for Deduplicated Storage , 2013, USENIX Security Symposium.

[4]  Mauro Conti,et al.  Privacy Aware Data Deduplication for Side Channel in Cloud Storage , 2020, IEEE Transactions on Cloud Computing.

[5]  Jin Li,et al.  Secure Deduplication with Efficient and Reliable Convergent Key Management , 2014, IEEE Transactions on Parallel and Distributed Systems.

[6]  Benny Pinkas,et al.  Secure Deduplication of Encrypted Data without Additional Independent Servers , 2015, CCS.

[7]  S L Warner,et al.  Randomized response: a survey technique for eliminating evasive answer bias. , 1965, Journal of the American Statistical Association.

[8]  Christoph Neumann,et al.  Improving the Resistance to Side-Channel Attacks on Cloud Storage Services , 2012, 2012 5th International Conference on New Technologies, Mobility and Security (NTMS).

[9]  Kwangjo Kim,et al.  Differentially private client-side data deduplication protocol for cloud storage services , 2015, Secur. Commun. Networks.

[10]  Benny Pinkas,et al.  Proofs of ownership in remote storage systems , 2011, CCS '11.

[11]  Dooho Choi,et al.  Privacy-preserving cross-user source-based data deduplication in cloud storage , 2012, 2012 International Conference on ICT Convergence (ICTC).

[12]  Benny Pinkas,et al.  Side Channels in Cloud Services: Deduplication in Cloud Storage , 2010, IEEE Security & Privacy.

[13]  Yiwei Thomas Hou,et al.  Modeling the side-channel attacks in data deduplication with game theory , 2015, 2015 IEEE Conference on Communications and Network Security (CNS).

[14]  Mohammad Mannan,et al.  An evaluation of recent secure deduplication proposals , 2016, J. Inf. Secur. Appl..

[15]  Hong Jiang,et al.  A Comprehensive Study of the Past, Present, and Future of Data Deduplication , 2016, Proceedings of the IEEE.