On Store Placement for Response Time Minimization in Parallel Disks

We investigate the placement of N enterprise data stores (e.g., database tables, application data) across an array of disks, with the aim of minimizing the response time averaged over all served requests while balancing the load evenly across all disks in the parallel disk array. Incorporating the non-FCFS serving discipline and the non-work-conserving nature of disk drives into the formulation of the placement problem is difficult, and current placement strategies do not take them into account. We present a novel formulation of the placement problem that incorporates these crucial features, and we identify the runlength of requests accessing a store as the most important criterion for placing the stores. We use these insights to design a fast placement algorithm, with running time O(N log N), that is optimal under the assumption that transfer times are small. Comprehensive experimental studies establish the efficacy of the proposed algorithm under a wide variety of workloads: on real storage traces, it reduces the response time by more than a factor of 2 under heterogeneous workload scenarios.
