A methodology for adaptive resolution of numerical problems on heterogeneous hierarchical clusters

Solving a target problem with a single algorithm, or writing one portable program, does not guarantee good performance on every parallel environment: computing platforms are increasingly diverse, and new architectural characteristics now influence how parallel applications execute. The inherent heterogeneity of such environments and the diversity of their networks make it a great challenge to implement parallel applications efficiently for high performance computing. Our objective in this work is to propose a generic framework, based on adaptive techniques, for solving a class of numerical problems on cluster-based heterogeneous hierarchical platforms. Toward this goal, we rely on adaptive approaches to tailor a given application to a target parallel system. We apply this methodology to a basic numerical problem, matrix multiplication, and determine an adaptive execution scheme that minimizes the overall execution time as a function of the problem and architecture parameters.
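As an illustration only, the idea of selecting an execution scheme from problem and architecture parameters can be sketched in Python. This is not the framework proposed in this work. The `Cluster` record, the two candidate schemes, and the linear cost model (compute time plus latency plus volume over bandwidth) are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of one cluster of the platform."""
    flops: float      # aggregate compute rate (floating-point operations/s)
    bandwidth: float  # bandwidth of the link to the coordinator (bytes/s)
    latency: float    # per-message latency on that link (s)


def single_cluster_time(n, cluster):
    """Estimated time to compute the whole n x n product on one cluster.

    Assumes the matrices already reside on that cluster, so only the
    2*n^3 floating-point operations are counted.
    """
    return 2.0 * n**3 / cluster.flops


def distributed_time(n, clusters, word=8):
    """Estimated time when rows of A are split in proportion to speed.

    Assumed scheme: each cluster receives all of B and its rows of A,
    then sends back its rows of C. The slowest cluster sets the time.
    """
    total = sum(c.flops for c in clusters)
    times = []
    for c in clusters:
        share = c.flops / total
        compute = 2.0 * n**3 * share / c.flops
        volume = word * n * n * (1.0 + 2.0 * share)  # B + rows of A + rows of C
        comm = 2.0 * c.latency + volume / c.bandwidth
        times.append(compute + comm)
    return max(times)


def choose_scheme(n, clusters):
    """Return (name, estimated time) of the cheapest candidate scheme."""
    candidates = {
        f"single:{i}": single_cluster_time(n, c)
        for i, c in enumerate(clusters)
    }
    candidates["distributed"] = distributed_time(n, clusters)
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


if __name__ == "__main__":
    # Hypothetical two-cluster platform; the numbers are illustrative only.
    platform = [
        Cluster(flops=1e10, bandwidth=1e8, latency=1e-3),
        Cluster(flops=5e9, bandwidth=1e8, latency=1e-3),
    ]
    for n in (100, 20000):
        print(n, choose_scheme(n, platform))
```

Under this assumed model, small matrices stay on the fastest cluster because communication dominates, while large matrices are spread over all clusters. A real adaptive scheme would replace the cost functions with measured or calibrated performance models.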
