BroadScale: Efficient scaling of heterogeneous storage systems

Scalable storage architectures allow digital libraries and archives to add or remove storage devices, either to increase storage capacity and bandwidth or to retire older devices. Past work in this area has mainly focused on statically scaling homogeneous storage devices. However, heterogeneous devices are quickly being adopted for storage scaling since they are usually faster, larger, more widely available, and more cost-effective. We propose BroadScale, an algorithm based on Random Disk Labeling, to dynamically scale heterogeneous storage systems by distributing data objects across devices in proportion to the device weights. Assuming a random placement of objects across a group of heterogeneous storage devices, our optimization objectives when scaling are to ensure a uniform distribution of objects, to redistribute a minimum number of objects, and to maintain fast data access with low computational complexity. We show through experimentation that BroadScale meets these objectives when scaling heterogeneous storage.
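To make the three objectives concrete, the sketch below places objects on weighted devices using weighted rendezvous (highest-random-weight) hashing. This is a stand-in technique chosen for illustration only; it is not BroadScale or Random Disk Labeling, whose details the abstract does not give. The device names, the weights, and the helper names `unit_hash` and `place` are all hypothetical.

```python
import hashlib
import math


def unit_hash(obj_id: str, device: str) -> float:
    """Map an (object, device) pair to a pseudo-random float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{obj_id}:{device}".encode()).digest()
    n = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits so the float is exact
    return (n + 0.5) / 2**52


def place(obj_id: str, weights: dict) -> str:
    """Pick the device with the highest weighted score for this object.

    With score = -w / ln(h), a device wins with probability proportional
    to its weight w (weighted rendezvous hashing; NOT the paper's algorithm).
    """
    return max(weights, key=lambda d: -weights[d] / math.log(unit_hash(obj_id, d)))


# Hypothetical devices and weights (e.g. relative capacity or bandwidth).
before = {"disk0": 1.0, "disk1": 2.0, "disk2": 1.0}
after = dict(before, disk3=4.0)  # scale up by adding one larger device

objects = [f"obj{i}" for i in range(20000)]
old = {o: place(o, before) for o in objects}
new = {o: place(o, after) for o in objects}

# Objective 1: load share follows weight. Objective 2: few objects move.
share_disk1 = sum(1 for d in old.values() if d == "disk1") / len(objects)
moved = [o for o in objects if old[o] != new[o]]
print(f"disk1 share before scaling: {share_disk1:.3f}")
print(f"fraction moved after adding disk3: {len(moved) / len(objects):.3f}")
```

In this stand-in scheme, each lookup costs one hash per device, and adding a device can only move objects onto the new device, because the scores of the existing devices do not change. How BroadScale itself achieves the corresponding properties is described in the paper, not here.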
