Adaptive matrix multiplication in heterogeneous environments

In this paper, an adaptive matrix multiplication algorithm for dynamic heterogeneous environments is developed and evaluated. State-of-the-art approaches achieve load balancing by distributing the matrix data unequally among the heterogeneous nodes. In our approach, by contrast, the matrices are partitioned into blocks of equal size, and both the task allocation and the block size are adapted at run time. Data prefetching is used to make communication efficient. The approach can be combined with various task scheduling heuristics. Further, we show that its control and coordination overheads are negligible compared with the overall execution time. The effectiveness of the approach is verified with a configurable simulator developed for understanding the performance of heterogeneous computing environments.
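The idea of equal-size blocks with run-time task allocation can be illustrated with a minimal sketch. This is not the algorithm evaluated in the paper: the function names (`block_tasks`, `multiply_block`, `adaptive_multiply`), the per-node `speeds` list, and the cost model (multiply-add count divided by node speed) are assumptions made for illustration. The block size `b` is held fixed here, and communication and prefetching are not modeled. Each output block is one task, and each task goes to whichever simulated node becomes idle first.

```python
import heapq


def block_tasks(n, b):
    """Yield the top-left corner (i0, j0) of every b-by-b output block."""
    for i0 in range(0, n, b):
        for j0 in range(0, n, b):
            yield i0, j0


def multiply_block(A, B, C, i0, j0, b):
    """Fill the block of C = A * B whose top-left corner is (i0, j0)."""
    n = len(A)
    for i in range(i0, min(i0 + b, n)):
        for j in range(j0, min(j0 + b, n)):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))


def adaptive_multiply(A, B, speeds, b):
    """Multiply square matrices A and B block by block.

    speeds[w] is a hypothetical relative speed of node w. Tasks are handed
    out on demand: the next block always goes to the node that is idle
    earliest in simulated time. Returns (C, tasks per node, makespan).
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # Min-heap of (simulated time at which the node becomes idle, node id).
    idle = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(idle)
    counts = [0] * len(speeds)

    for i0, j0 in block_tasks(n, b):
        t, w = heapq.heappop(idle)
        multiply_block(A, B, C, i0, j0, b)
        rows = min(i0 + b, n) - i0
        cols = min(j0 + b, n) - j0
        # Assumed cost model: multiply-adds in the block over node speed.
        cost = rows * cols * n / speeds[w]
        counts[w] += 1
        heapq.heappush(idle, (t + cost, w))

    makespan = max(t for t, _ in idle)
    return C, counts, makespan


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C, counts, makespan = adaptive_multiply(A, B, speeds=[1.0, 3.0], b=1)
    print(C, counts, makespan)
```

Because the blocks are equal in size, the load balance in this sketch comes entirely from how many blocks each node receives: a faster node returns to the idle queue sooner and therefore picks up more tasks. A different scheduling heuristic could be swapped in by changing how the next node is chosen.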
