Data De-duplication Using Cuckoo Hashing in Cloud Storage

Cloud computing provides on-demand, ubiquitous access to a centralized pool of resources such as applications, networks, and storage services. The recent growth in computing has produced enormous volumes of data, which are backed up and stored in the cloud and made available to address consent, deliver real-time insights, and regulate the data. Redundant copies of the same data are therefore stored in multiple places and occupy additional space on servers. To address this problem, an enhanced cuckoo hashing algorithm is proposed for identifying duplicate data. Cuckoo hashing performs lookup and deletion in worst-case constant time and insertion in expected constant time. The metadata of files, such as the basic file attributes, user-defined attributes, and user principal owner attributes, is preserved after de-duplication. The experimental results show the efficiency of cuckoo hashing in de-duplicating data chunks in the cloud environment.
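To illustrate how a cuckoo hash table can flag duplicate chunks, the following is a minimal sketch of textbook two-table cuckoo hashing, not the enhanced algorithm proposed in the paper. It assumes fixed-size chunking, SHA-256 chunk fingerprints, and table positions derived from two disjoint parts of the digest; the names `CuckooIndex` and `deduplicate` are hypothetical, and the in-memory `store` dictionary merely stands in for cloud chunk storage.

```python
import hashlib


class CuckooIndex:
    """Two-table cuckoo hash set of chunk fingerprints (illustrative only)."""

    def __init__(self, capacity=8, max_kicks=32):
        self.capacity = capacity
        self.max_kicks = max_kicks
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, fp, which):
        # Assumption: use disjoint 8-byte slices of the digest as two hashes.
        start = 8 * which
        return int.from_bytes(fp[start:start + 8], "big") % self.capacity

    def contains(self, fp):
        # A lookup probes at most two slots, one per table.
        return any(self.tables[t][self._slot(fp, t)] == fp for t in (0, 1))

    def remove(self, fp):
        # Deletion also inspects at most two slots.
        for t in (0, 1):
            i = self._slot(fp, t)
            if self.tables[t][i] == fp:
                self.tables[t][i] = None
                return True
        return False

    def add(self, fp):
        """Insert fp; return False if it is already present (a duplicate)."""
        if self.contains(fp):
            return False
        item, t = fp, 0
        for _ in range(self.max_kicks):
            i = self._slot(item, t)
            # Place the item and pick up whatever occupied the slot.
            item, self.tables[t][i] = self.tables[t][i], item
            if item is None:
                return True
            t = 1 - t  # the evicted entry moves to its slot in the other table
        self._grow(item)  # too many evictions: enlarge and reinsert everything
        return True

    def _grow(self, pending):
        entries = [e for table in self.tables for e in table if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for entry in entries:
            self.add(entry)


def deduplicate(data, chunk_size=4):
    """Split data into fixed-size chunks and keep one copy of each chunk."""
    index = CuckooIndex()
    store = {}    # stand-in for chunk storage: fingerprint -> chunk bytes
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fp = hashlib.sha256(chunk).digest()
        if index.add(fp):  # True only the first time this chunk is seen
            store[fp] = chunk
        recipe.append(fp)
    return recipe, store
```

Calling `deduplicate(b"abcdabcdwxyzabcd")` stores each distinct 4-byte chunk once, and the original bytes can be rebuilt by concatenating `store[fp]` for each fingerprint in `recipe`. File metadata would need to be kept alongside the recipe; that part is outside this sketch.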
