Variable-size data item placement for load and storage balancing

The rapid growth of the Internet brings the need for a low-cost, high-performance file system. Two objectives are pursued in building such a large-scale storage system on multiple disks: load balancing and storage minimization. We investigate the optimization problem of placing variable-size data items onto multiple disks, with replication, to achieve both objectives. An approximation algorithm, called LSB_Placement, is proposed for this problem. The algorithm performs bin packing together with MMPacking to obtain a load-balanced placement with near-optimal storage balancing. The key issue in deriving the algorithm is finding the bin capacity for the bin packing that minimizes storage cost. We derive the optimal bin capacity and prove that the LSB_Placement algorithm is asymptotically 1-optimal on storage balancing; that is, when the problem size exceeds a certain threshold, the algorithm generates a load-balanced placement in which the data sizes allocated to the disks are almost balanced. We demonstrate that, for various Web applications, a load-balanced placement can be generated with a disk capacity no more than 10% above the balanced storage space. This shows that the LSB_Placement algorithm is useful in constructing a low-cost, high-performance storage system.
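To make the bin-packing step more concrete, the following is a minimal sketch of packing variable-size items into bins of a fixed capacity and then measuring how evenly storage is spread. It is not the LSB_Placement algorithm: it uses the standard first-fit decreasing heuristic as a stand-in, ignores access load and replication, and takes the bin capacity as a given input rather than deriving it. The function names `first_fit_decreasing` and `storage_imbalance` and the sample sizes are hypothetical.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins of `capacity` with first-fit decreasing.

    Illustrative stand-in only; not the paper's LSB_Placement.
    Returns a list of bins, each a list of item indices.
    """
    bins = []  # item indices per bin
    free = []  # remaining capacity per bin
    # Visit items from largest to smallest (ties keep original order).
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)
    for i in order:
        if sizes[i] > capacity:
            raise ValueError("item larger than bin capacity")
        for b in range(len(bins)):
            if sizes[i] <= free[b]:
                # Place the item in the first bin with enough room.
                bins[b].append(i)
                free[b] -= sizes[i]
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([i])
            free.append(capacity - sizes[i])
    return bins


def storage_imbalance(sizes, bins):
    """Ratio of the fullest bin to the perfectly balanced share (1.0 is ideal)."""
    used = [sum(sizes[i] for i in b) for b in bins]
    balanced = sum(used) / len(bins)
    return max(used) / balanced


if __name__ == "__main__":
    item_sizes = [5, 4, 3, 3, 2, 2, 1]  # hypothetical item sizes
    packed = first_fit_decreasing(item_sizes, capacity=10)
    print(packed, storage_imbalance(item_sizes, packed))
```

A ratio of 1.10 under this measure would correspond to the fullest disk holding 10% more than the balanced storage space, which is the kind of bound the abstract reports for Web workloads.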
