Mitigating Traffic-Based Side Channel Attacks in Bandwidth-Efficient Cloud Storage

Data deduplication identifies and eliminates redundant data, keeping only a single copy of each file or chunk. It is therefore widely used in distributed and cloud storage systems to save the network bandwidth that users spend uploading files. However, whether deduplication occurred can be easily detected by monitoring and analyzing network traffic, which puts user privacy at risk. By exploiting this network-traffic side channel, an attacker can carry out a very dangerous side channel attack, the learn-the-remaining-information (LRI) attack, to reveal users' private information. Existing work addresses the LRI attack at the cost of high bandwidth consumption. To address this problem, we propose a simple yet effective scheme, called the randomized redundant chunk scheme (RRCS), that significantly mitigates the risk of the LRI attack while maintaining the high bandwidth efficiency of deduplication. The idea behind RRCS is to add randomized redundant chunks that mask the real deduplication states of files, which obfuscates the view of an attacker who attempts to exploit the network-traffic side channel. Our security analysis shows that RRCS significantly mitigates the risk of the LRI attack. We have implemented an RRCS prototype and evaluated it on three real-world datasets. Experimental results demonstrate that RRCS significantly outperforms existing work in terms of bandwidth efficiency.
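
The idea of adding randomized redundant chunks can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `chunks_to_upload`, the `max_ratio` cap, and the uniform choice of how many redundant chunks to add are all assumptions made for illustration, since the abstract does not specify them. The sketch assumes that the side deciding what is uploaded knows which chunks are already stored.

```python
import random


def chunks_to_upload(chunk_ids, stored_ids, max_ratio=0.5, rng=None):
    """Return the chunk IDs to transfer for one file (illustrative only).

    chunk_ids  -- IDs (e.g. fingerprints) of the file's chunks
    stored_ids -- set of chunk IDs already held by the storage service
    max_ratio  -- hypothetical cap on redundant chunks, as a fraction
                  of the file's chunk count
    """
    if rng is None:
        rng = random.Random()

    # Chunks the service does not have must always be uploaded.
    missing = [c for c in chunk_ids if c not in stored_ids]
    # Chunks the service already has are candidates for redundant upload.
    duplicates = [c for c in chunk_ids if c in stored_ids]

    # Draw a random number of redundant chunks (assumed uniform here),
    # so observed traffic no longer equals the number of missing chunks.
    cap = min(len(duplicates), int(max_ratio * len(chunk_ids)))
    extra = rng.sample(duplicates, rng.randint(0, cap))

    requested = missing + extra
    rng.shuffle(requested)  # avoid leaking which chunks were redundant
    return requested


stored = {"a", "b", "c", "d"}
file_chunks = ["a", "b", "c", "d", "e"]
result = chunks_to_upload(file_chunks, stored, max_ratio=0.5,
                          rng=random.Random(0))
```

In this example only chunk `e` is genuinely new, but an observer sees between one and three chunks transferred, so the traffic volume alone does not reveal the file's true deduplication state. The extra chunks are the bandwidth price of the obfuscation, which is why the cap is kept small in this sketch.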
