Processor mapping technique for communication free data redistribution on symmetrical matrix

In this paper, we present a processor mapping technique that eliminates data exchange in runtime data redistribution on symmetric matrices. The main idea of the proposed technique is to develop mathematical functions that map destination processors to a new sequence of processor IDs. The realigned order of destination processors is then used to perform data redistribution in the receiving phase. Together with a local matrix transposition scheme, interprocessor communication can be eliminated entirely in runtime redistribution. A further advantage of this approach is that the complicated communication sets need not be computed, which greatly reduces the indexing cost. Theoretical analysis shows that (p-1)/p of the data transmission cost can be saved for a redistribution over a p × p processor grid. Experimental results also show that the processor mapping technique substantially improves runtime data redistribution.
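To make the role of symmetry and local transposition concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplified setting chosen for illustration: a symmetric matrix is split into square blocks over a p × p grid, processor (i, j) initially holds block (i, j), and the target layout requires processor (i, j) to hold block (j, i). All function names are hypothetical. In this toy setting, a naive redistribution moves every off-diagonal block, which is a (p-1)/p fraction of the data, whereas each processor can produce its target block by transposing what it already holds.

```python
from fractions import Fraction


def make_symmetric(n):
    """Build an n x n symmetric matrix: entry (r, c) equals entry (c, r)."""
    return [[min(r, c) * n + max(r, c) for c in range(n)] for r in range(n)]


def block(a, bi, bj, b):
    """Extract the b x b block at block-row bi, block-column bj."""
    return [row[bj * b:(bj + 1) * b] for row in a[bi * b:(bi + 1) * b]]


def transpose(m):
    """Transpose a small dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def naive_transfer_fraction(p):
    """Fraction of blocks a naive scheme would send between processors.

    Assumption of this sketch: only the p diagonal processors already hold
    the block they need; the other p*p - p blocks must be transmitted.
    """
    moved = sum(1 for i in range(p) for j in range(p) if i != j)
    return Fraction(moved, p * p)


def redistribute_locally(local_blocks):
    """Each processor transposes its own block; no data leaves a processor."""
    return {pid: transpose(blk) for pid, blk in local_blocks.items()}


n, p = 6, 3          # matrix order and grid dimension (illustrative values)
b = n // p           # block size per processor
a = make_symmetric(n)

# Source layout: processor (i, j) holds block (i, j).
source = {(i, j): block(a, i, j, b) for i in range(p) for j in range(p)}
# Target layout: processor (i, j) must hold block (j, i).
target = {(i, j): block(a, j, i, b) for i in range(p) for j in range(p)}

# Local transposition alone reproduces the target layout.
assert redistribute_locally(source) == target
# The naive scheme would have moved (p-1)/p of the blocks.
assert naive_transfer_fraction(p) == Fraction(p - 1, p)
```

The sketch works only because block (j, i) of a symmetric matrix is the transpose of block (i, j); for general block-cyclic layouts, the mapping functions developed in the paper would take the place of the fixed (i, j) to (j, i) correspondence assumed here.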
