Superset: A Non-uniform Replica Placement Strategy towards High-Performance and Cost-Effective Distributed Storage Service

Load balance and power proportionality are both important in building high-performance and cost-effective distributed storage systems. However, traditional replica placement strategies aimed at load balance usually produce scattered replica layouts that preclude power proportionality, while recent strategies aimed at power proportionality are typically based on uniform replication, which compromises the ability to balance load. In this article, we introduce Superset, an organized non-uniform replica placement strategy that takes both load balance and power proportionality into account. The main idea is to partition the whole system into multiple subsystems, each based on uniform replication, such that the file subsets they accommodate satisfy the 'superset' condition. We have conducted a series of simulations with real-world distributions of data popularity. Our results show that, compared to state-of-the-art solutions, Superset consumes less energy to meet the same performance requirement and offers better performance under the same energy consumption constraint.
