A Balanced Allocation Strategy for File Assignment in Parallel I/O Systems

In parallel I/O systems, fast response to disk accesses and load balancing are two important performance objectives for end users and applications. Both are largely determined by the data allocation strategy, that is, the file assignment algorithm. However, most existing algorithms, including the well-known Greedy, Sort Partition (SP), and Hybrid Partition (HP) algorithms, achieve only one of these objectives. Algorithms that achieve both are therefore needed for parallel I/O systems. In this paper, we propose two new allocation algorithms for file assignment in parallel I/O systems: an offline Balanced Allocation with Sort (BAS) algorithm and an online Balanced Allocation with Sort for Batch (BASB) algorithm. Both algorithms aim to optimize mean response time and load balancing at the same time. The experimental results show that BAS achieves the best response time among all compared algorithms and better load balancing than SP. The online BASB algorithm achieves the best performance on both response time and load balancing among all compared algorithms.
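To make the load-balancing objective concrete, the following is a minimal sketch of a greedy baseline of the kind the abstract compares against: files are taken heaviest first and each is placed on the disk with the smallest accumulated load. It is not the BAS or BASB algorithm, whose details are not given here. The function name `greedy_assign`, the input format, and the notion of a per-file "load" (for instance, access rate multiplied by service time) are assumptions made for illustration.

```python
import heapq


def greedy_assign(file_loads, num_disks):
    """Illustrative greedy baseline (hypothetical helper, not BAS/BASB).

    file_loads: dict mapping file id -> load (assumed here to be something
                like access rate * service time).
    Returns (assignment, disk_loads), where assignment maps file id -> disk.
    """
    # Min-heap of (accumulated_load, disk_index); ties go to the lower index.
    heap = [(0.0, disk) for disk in range(num_disks)]
    heapq.heapify(heap)

    assignment = {}
    # Visit files from heaviest to lightest.
    for file_id in sorted(file_loads, key=file_loads.get, reverse=True):
        load, disk = heapq.heappop(heap)  # least-loaded disk so far
        assignment[file_id] = disk
        heapq.heappush(heap, (load + file_loads[file_id], disk))

    # Recompute per-disk totals from the final assignment.
    disk_loads = [0.0] * num_disks
    for file_id, disk in assignment.items():
        disk_loads[disk] += file_loads[file_id]
    return assignment, disk_loads


if __name__ == "__main__":
    loads = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 3.0, "e": 1.0}
    assignment, disk_loads = greedy_assign(loads, num_disks=2)
    print(assignment)
    print(disk_loads)
```

A placement like this evens out the total load per disk but ignores how files of very different service times end up mixed on one disk, which is one reason balancing load alone does not guarantee a good mean response time.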
