RRSD: A file replication method for ensuring data reliability and reducing storage consumption in a dynamic Cloud-P2P environment

Abstract Data reliability, storage consumption, and load balance are major concerns in current dynamic cloud storage. Traditional file replication methods achieve good load balance and high data reliability by keeping multiple replicas, but at the cost of heavy storage consumption. Some methods reduce storage consumption by dynamically removing redundant replicas, but they do not sufficiently ensure data reliability. To address this problem, this paper proposes RRSD, a file replication method for ensuring data reliability and reducing storage consumption in a dynamic Cloud-P2P environment. RRSD aims to minimize the number of replicas and obtain better load balance while meeting data reliability requirements. It combines "multiple times replica placement" with "redundant replica deletion" to achieve this goal. When a file is first stored in the cloud, RRSD works in a centralized manner and creates the minimum number of replicas that meets the data reliability requirement for the file's storage expectation, which reduces storage consumption. To respond promptly to the time-varying dynamic cloud, it then works in a decentralized, self-adaptive manner, dynamically creating fewer replicas and selecting optimal nodes as placement nodes to improve load balance. Meanwhile, RRSD uses periodic detection to ensure data reliability. In addition, it uses a verification-evaluation method to selectively remove redundant replicas and further reduce storage consumption. Extensive experiments demonstrate that, compared with similar methods, RRSD performs better in load balance, data reliability, and storage consumption: it improves load balance by 10% and reduces storage consumption by 60% while meeting the data reliability requirement.
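The idea of creating only as many replicas as the reliability requirement demands can be illustrated with a minimal sketch. It is not the model used by RRSD, which the abstract does not specify. It assumes, purely for illustration, that each storage node fails independently with a known probability over the file's storage expectation, and that the file survives if at least one replica survives. The function name `min_replicas` and its parameters are hypothetical.

```python
def min_replicas(node_failure_prob: float, required_reliability: float) -> int:
    """Smallest replica count n with 1 - p**n >= required reliability.

    Illustrative assumption (not taken from the paper): nodes fail
    independently with probability p during the storage period, and
    the file is lost only if every replica is lost.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be strictly between 0 and 1")
    if not 0.0 <= required_reliability < 1.0:
        raise ValueError("required_reliability must be in [0, 1)")

    n = 1
    # Add replicas one at a time until the survival probability is high enough.
    while 1.0 - node_failure_prob ** n < required_reliability:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical inputs: 10% node failure probability, 99.95% target.
    print(min_replicas(0.1, 0.9995))
```

Under these assumptions a less reliable node pool or a stricter target raises the count, which is why a fixed replica number for all files tends to waste storage on files with modest requirements.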
