A parallel loop self-scheduling on grid computing environments

Internet computing and grid technologies promise to change the way we tackle complex problems. They will enable large-scale aggregation and sharing of computational, data, and other resources across institutional boundaries, and harnessing them effectively will transform scientific disciplines ranging from high-energy physics to the life sciences. In this paper, a grid computing environment is proposed and constructed on multiple PC clusters using the Globus Toolkit (GT) and Sun Grid Engine (SGE). Experiments with matrix multiplication are conducted to demonstrate its performance. Approaches to scheduling and load balancing on systems of multiple heterogeneous PC clusters, however, are not yet mature. Self-scheduling schemes suited to parallel loops with independent iterations on heterogeneous cluster systems have been designed in the past. However, these schemes, such as factoring self-scheduling (FSS), guided self-scheduling (GSS), and trapezoid self-scheduling (TSS), cannot achieve load balancing in an extremely heterogeneous environment. We propose a heuristic approach based on a two-phase scheme to solve the parallel regular loop scheduling problem in an extremely heterogeneous grid computing environment.
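To make the load-balancing concern concrete, the following is a minimal sketch of guided self-scheduling in its textbook form, where each idle worker takes ceil(remaining / P) iterations, together with a toy simulation on workers of unequal speed. It is not the two-phase scheme proposed in the paper; the function names `gss_chunks` and `simulate`, the worker speeds, and the loop size are all assumptions chosen for illustration.

```python
import heapq
import math


def gss_chunks(total_iterations, num_workers):
    """Chunk sizes under guided self-scheduling: ceil(remaining / P) per request."""
    remaining = total_iterations
    chunks = []
    while remaining > 0:
        size = math.ceil(remaining / num_workers)
        chunks.append(size)
        remaining -= size
    return chunks


def simulate(chunks, speeds):
    """Per-worker finish times when each idle worker grabs the next chunk.

    speeds[w] is a hypothetical rate in iterations per time unit.
    """
    # Min-heap of (time the worker becomes idle, worker index).
    idle = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(idle)
    finish = [0.0] * len(speeds)
    for size in chunks:
        t, w = heapq.heappop(idle)
        t += size / speeds[w]
        finish[w] = t
        heapq.heappush(idle, (t, w))
    return finish


if __name__ == "__main__":
    chunks = gss_chunks(100, 4)
    # One worker is ten times slower than the other three (illustrative values).
    finish = simulate(chunks, speeds=[1.0, 10.0, 10.0, 10.0])
    print("chunks:", chunks)
    print("finish times:", finish)
```

In this toy run the slow worker happens to receive the first and largest chunk, so it finishes long after the fast workers have drained the rest of the loop. That is the kind of imbalance a scheme has to avoid when node speeds differ widely.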
