Optimizing Communications of Data Parallel Programs in Scalable Cluster Systems

With the improvement of personal computers and high-speed networks, clusters have become the prevailing approach to designing high-performance computing environments. As research on the hardware and software technology behind clusters and grids continues to advance, cluster applications are growing in popularity. Because applications of all kinds demand ever more computing capacity, computation has also extended across networks: several clusters connected through the Internet are combined into a large-scale computing platform, the cluster grid. Since data may be partitioned and exchanged while a program executes, communication localization becomes important to program efficiency. This paper proposes a mathematical method that partitions data effectively and keeps computation on that data within the local environment. We also present a theoretical analysis of the number of computing nodes and of the data partitioning, with the aim of applying the method to practical parallel environments and further reducing communication cost.
