Efficient Multidimensional Data Redistribution for Resizable Parallel Computations

Traditional parallel schedulers on cluster supercomputers support only static scheduling, in which the number of processors allocated to an application remains fixed for the duration of the job. This leaves idle system resources underutilized and decreases overall system throughput. In our research, we have developed a prototype framework called ReSHAPE, which supports dynamic resizing of parallel MPI applications executing on distributed-memory platforms. The resizing library in ReSHAPE supports releasing and acquiring processors and efficiently redistributing application state to a new set of processors. In this paper, we derive an algorithm for redistributing two-dimensional block-cyclic arrays from P to Q processors, where the processors are organized as 2-D grids of size Pr × Pc and Qr × Qc, respectively. The algorithm guarantees a contention-free communication schedule for data redistribution if Pr ≤ Qr and Pc ≤ Qc. In all other cases, the algorithm applies circular row and column shifts to the communication schedule to minimize node contention.
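
To make the redistribution problem concrete, the following minimal Python sketch shows which processor owns each block under a 2-D block-cyclic layout and which blocks must move when the grid changes from Pr × Pc to Qr × Qc. It is an illustration only, not the scheduling algorithm derived in the paper: the function names `owner` and `redistribution_messages` are hypothetical, the array is assumed to be addressed in whole blocks, and no attempt is made to order the messages into contention-free steps.

```python
from collections import defaultdict


def owner(block_row, block_col, grid_rows, grid_cols):
    """Return the (row, col) grid coordinates of the processor that owns
    a block under a 2-D block-cyclic distribution."""
    return (block_row % grid_rows, block_col % grid_cols)


def redistribution_messages(n_block_rows, n_block_cols, src_grid, dst_grid):
    """Group blocks by (source processor, destination processor).

    src_grid and dst_grid are (rows, cols) tuples, i.e. (Pr, Pc) and (Qr, Qc).
    Each key of the result is one point-to-point transfer; its value lists
    the block indices that the transfer must carry.
    """
    messages = defaultdict(list)
    for i in range(n_block_rows):
        for j in range(n_block_cols):
            src = owner(i, j, *src_grid)
            dst = owner(i, j, *dst_grid)
            messages[(src, dst)].append((i, j))
    return dict(messages)


# Example: a 4 x 4 array of blocks moved from a 1 x 2 grid to a 2 x 2 grid.
msgs = redistribution_messages(4, 4, src_grid=(1, 2), dst_grid=(2, 2))
for (src, dst), blocks in sorted(msgs.items()):
    print(f"{src} -> {dst}: {blocks}")
```

In this example each source processor sends to two destination processors. A communication schedule then has to decide in which step each of these transfers happens so that no destination receives from more than one sender at a time, which is the problem the abstract refers to as node contention.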
