Performance modeling of parallel applications for grid scheduling

Grids consist of both dedicated and non-dedicated clusters. To map parallel applications effectively onto grid resources, a grid metascheduler must compare candidate sets of resources by the predicted execution time of the application on each set. In this work, we have developed a comprehensive set of performance modeling strategies for predicting the execution times of parallel applications in both dedicated and non-dedicated environments. Our strategies adapt to changing network and CPU loads on the grid resources. We have evaluated them on 8-, 16-, 24- and 32-node clusters with random loads and with load traces from a grid system. The average percentage prediction error is below 30% in all cases, which, to our knowledge, is the best reported for non-dedicated environments. We also found that grid scheduling based on execution times predicted by our performance modeling techniques leads to perfect mapping of applications to resources in many cases.
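The two quantities the abstract relies on can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the average percentage prediction error is the mean of |predicted - actual| / actual, expressed in percent, and that the metascheduler simply picks the candidate resource set with the smallest predicted execution time. The function names and the cluster labels are hypothetical.

```python
from typing import Dict, Sequence


def average_percentage_error(predicted: Sequence[float],
                             actual: Sequence[float]) -> float:
    """Mean of |predicted - actual| / actual over all runs, in percent.

    Assumed definition; the abstract does not spell out the formula.
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - a) / a for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)


def best_resource_set(predicted_times: Dict[str, float]) -> str:
    """Return the candidate resource set with the lowest predicted time.

    Assumed selection rule: minimize predicted execution time.
    """
    if not predicted_times:
        raise ValueError("no candidate resource sets given")
    return min(predicted_times, key=predicted_times.get)


if __name__ == "__main__":
    # Hypothetical predicted and measured execution times, in seconds.
    print(average_percentage_error([110.0, 90.0], [100.0, 100.0]))
    # Hypothetical candidate resource sets with predicted times, in seconds.
    print(best_resource_set({"cluster_a": 120.0, "cluster_b": 95.5}))
```

Under this reading, a mapping is "perfect" when the resource set chosen from predicted times is the same one that would have been chosen from the measured times.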
