An Efficient and Secure Deduplication Scheme Based on Rabin Fingerprinting in Cloud Storage

Data deduplication is widely used in backup systems to save storage space and network bandwidth. To improve the efficiency and security of existing deduplication schemes, this paper proposes a secure deduplication scheme for cloud storage. The proposed scheme allows cloud storage servers to eliminate duplicate data before users perform encryption, which reduces computation overhead. The scheme achieves variable-size block-level deduplication using Rabin fingerprinting, which selects block boundaries based on properties of the block contents and hence supports data updates and variable-size files. A trusted third-party server is introduced to randomize and manage the convergent keys. Security analysis indicates that the proposed scheme is secure against offline brute-force dictionary attacks.
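
To illustrate how content-defined block boundaries tolerate updates, the following is a minimal sketch of variable-size chunking. It is not the scheme's actual implementation: it substitutes a simple polynomial rolling hash (Rabin-Karp style, modulo a prime) for a true Rabin fingerprint over GF(2), and the window size, base, modulus, boundary mask, chunk-size limits, and the use of SHA-256 as a per-block tag are all assumptions chosen for illustration.

```python
import hashlib
from typing import List

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 48             # bytes covered by the sliding window
BASE = 257              # polynomial base of the rolling hash
MOD = (1 << 61) - 1     # prime modulus of the rolling hash
MASK = (1 << 11) - 1    # cut when the low 11 bits of the hash are zero
MIN_SIZE = 512          # minimum block size in bytes
MAX_SIZE = 8192         # maximum block size in bytes


def split_chunks(data: bytes) -> List[bytes]:
    """Split data into variable-size blocks at content-defined boundaries."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        size = i - start + 1  # length of the current block, including byte i
        if size > WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        # Cut where the window hash matches the mask, within the size limits.
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing block, may be below MIN_SIZE
    return chunks


def chunk_digests(data: bytes) -> List[str]:
    """Return a SHA-256 tag per block, usable as a duplicate-detection key."""
    return [hashlib.sha256(c).hexdigest() for c in split_chunks(data)]
```

Because a boundary depends only on the bytes inside the sliding window, inserting or deleting data near the start of a file typically changes only the blocks around the edit. Later boundaries fall on the same content, so their tags still match previously stored blocks, whereas fixed-size blocking would shift every subsequent block.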
