Distributed scheduling algorithms to improve the performance of parallel data transfers

The cost of data transfers, and in particular of I/O operations, is a growing problem in parallel computing. A promising approach to alleviating this bottleneck is to schedule parallel I/O operations explicitly. We develop a class of decentralized algorithms for scheduling parallel I/O operations, where the objective is to reduce the time required to complete a given set of transfers. These algorithms, based on edge-coloring and matching of bipartite graphs, rely upon simple heuristics to obtain shorter schedules. We present simulation results indicating that the best of our algorithms can produce schedules whose length is within 2-20% of the optimal schedule length, a substantial improvement on previous decentralized algorithms. We discuss theoretical and experimental work in progress and possible extensions.
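To make the graph formulation concrete, the following is a minimal sketch of how a set of transfers can be viewed as a bipartite multigraph and scheduled by edge coloring. It is not one of the decentralized algorithms of this paper: it is a simple centralized greedy coloring, shown only for illustration. It assumes that every transfer takes one unit time slot and that each source and each destination can take part in at most one transfer per slot. The names `greedy_schedule` and `optimal_length` are hypothetical.

```python
from collections import Counter


def greedy_schedule(transfers):
    """Give each (source, destination) transfer the earliest time slot in
    which both endpoints are free. This is greedy edge coloring: slots are
    colors, transfers are edges. Illustrative only, and centralized."""
    busy_src = {}  # source -> set of slots already in use
    busy_dst = {}  # destination -> set of slots already in use
    slots = []
    for src, dst in transfers:
        used = busy_src.setdefault(src, set()) | busy_dst.setdefault(dst, set())
        slot = 0
        while slot in used:
            slot += 1
        busy_src[src].add(slot)
        busy_dst[dst].add(slot)
        slots.append(slot)
    return slots


def optimal_length(transfers):
    """Maximum degree of the bipartite multigraph. Under the unit-time
    assumption above, Konig's edge-coloring theorem says this equals the
    length of an optimal schedule."""
    src_deg = Counter(s for s, _ in transfers)
    dst_deg = Counter(d for _, d in transfers)
    return max(list(src_deg.values()) + list(dst_deg.values()), default=0)
```

For example, with the transfers `[("a", "x"), ("b", "y"), ("b", "z"), ("a", "z")]`, `greedy_schedule` returns the slots `[0, 0, 1, 2]`, a schedule of length 3, while `optimal_length` returns 2. The gap between the two is the kind of loss that a better heuristic aims to reduce.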
