Algorithmic Issues on Heterogeneous Computing Platforms

Abstract. This paper discusses some algorithmic issues that arise when computing with a heterogeneous network of workstations (the typical poor man's parallel computer). Dealing with processors of different speeds requires more involved strategies than block-cyclic data distributions. Dynamic data distribution is a first possibility, but it may prove impractical and not scalable because of communication and control overhead. Static data distributions tuned to balance execution times are another possibility, but they may prove inefficient because of variations in processor speeds (e.g. because of different workloads during the computation). We introduce a static distribution strategy that can be refined on the fly, and we show that it is well suited to parallelizing scientific computing applications such as finite-difference stencils or LU decomposition.

Résumé. This article addresses algorithmic problems related to implementing parallel programs on a heterogeneous network of workstations (typically the poor programmer's parallel machine). While a cyclic data distribution is often optimal on a network of homogeneous machines, it is no longer suited at all to this new configuration. Several strategies can be considered: in particular, distributions performed dynamically (which incur communication overhead) stand opposed to those performed statically (which suffer from poor estimates caused by variations in machine load). Here, we cut the iteration space into slices; at the end of the execution of a slice, knowing the time each processor spent computing, we re-evaluate the processor speeds; we then redistribute the data optimally for the execution of the next slice. Finally, we show through experiments that our approach is appropriate for many scientific computing kernels such as LU decomposition or finite-difference problems.
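The slice-by-slice refinement described above (run one slice, measure how long each processor spent computing, re-estimate speeds, redistribute) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function names `proportional_shares` and `reestimate_speeds` are hypothetical, the work is assumed to split into equal-cost units, and the redistribution is a simple proportional split with largest-remainder rounding rather than the optimal distribution the paper refers to.

```python
from typing import List


def proportional_shares(total_units: int, speeds: List[float]) -> List[int]:
    """Split total_units among processors in proportion to their speeds.

    Assumption: a proportional split with largest-remainder rounding,
    used here as a stand-in for an optimal distribution.
    """
    total_speed = sum(speeds)
    exact = [total_units * s / total_speed for s in speeds]
    shares = [int(x) for x in exact]  # round every share down first
    leftover = total_units - sum(shares)
    # Give the remaining units to the largest fractional parts.
    order = sorted(range(len(speeds)),
                   key=lambda i: exact[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def reestimate_speeds(shares: List[int],
                      elapsed: List[float],
                      previous: List[float]) -> List[float]:
    """Re-estimate each speed as units processed per unit of compute time.

    A processor that had no work or no measurable time keeps its
    previous estimate (an assumption made for this sketch).
    """
    return [n / t if n > 0 and t > 0 else old
            for n, t, old in zip(shares, elapsed, previous)]


# One refinement step: the second processor turns out slower than expected.
speeds = [1.0, 1.0, 2.0]                       # initial estimates
shares = proportional_shares(100, speeds)      # static split for slice 1
elapsed = [5.0, 10.0, 10.0]                    # hypothetical measured times
speeds = reestimate_speeds(shares, elapsed, speeds)
next_shares = proportional_shares(100, speeds)  # split for slice 2
```

With the hypothetical timings above, the processor that took twice as long per unit receives a smaller share in the next slice, which is the behaviour the refinement is meant to achieve.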

[1] Yves Robert, et al. Algorithmic Issues on Heterogeneous Computing Platforms, 1999, Parallel Process. Lett.
