On-line automatic resource selection in distributed computing

A key problem in executing performance-critical applications in distributed computing environments (e.g. the Grid) is resource selection. Research on "automatic resource selection" aims to select suitable resources on behalf of users in order to optimize execution performance. However, most current approaches are static (i.e. resource selection is performed prior to execution) and require detailed application-specific information. In this paper, we introduce a novel on-line automatic resource selection approach. The approach is based on a simple control-theoretic principle: the application continuously reports its Execution Satisfaction Degree (ESD) to the middleware Application Agent (AA), which uses the reported ESD values to learn the execution behavior and tune the execution environment by adding, replacing, or deleting resources at runtime in order to satisfy users' performance requirements. A utility-based learning and tuning algorithm enables this automatic resource tuning and selection. A typical 2-D heat equation application is used to validate the approach. The results show that, without any resource or application knowledge provided in advance, the approach is able to find best-effort resources that satisfy users' execution requirements and to classify resources according to their contribution to the application.
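
The feedback loop described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the utility function or the tuning policy, so everything here is an assumption made for illustration. In particular, the class layout, the `report` method, the rule "utility of a resource = change in ESD after adding it", the target ESD of 1.0, and the toy model in which ESD is the sum of per-resource contributions are all hypothetical. The sketch covers only adding and deleting resources; replacing is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ApplicationAgent:
    """Hypothetical AA: learns a per-resource utility from reported ESD values."""
    target_esd: float = 1.0                                   # assumed: ESD >= 1.0 means "satisfied"
    pool: List[str] = field(default_factory=list)             # candidate resources not yet tried
    active: List[str] = field(default_factory=list)           # resources currently in use
    utility: Dict[str, float] = field(default_factory=dict)   # learned contribution per resource
    last_esd: Optional[float] = None                          # baseline ESD before the last add
    last_added: Optional[str] = None                          # resource whose effect is being observed

    def report(self, esd: float) -> str:
        """Consume one ESD report and return the tuning action taken."""
        # Learning step (assumed rule): credit the ESD change to the last added resource.
        if self.last_added is not None and self.last_esd is not None:
            resource = self.last_added
            self.utility[resource] = esd - self.last_esd
            self.last_added = None
            if self.utility[resource] <= 0:
                # The resource did not help, so delete it and keep the old baseline.
                self.active.remove(resource)
                return f"delete {resource}"
        self.last_esd = esd

        # Tuning step: hold if the requirement is met, otherwise try another resource.
        if esd >= self.target_esd or not self.pool:
            return "hold"
        resource = self.pool.pop(0)
        self.active.append(resource)
        self.last_added = resource
        return f"add {resource}"


# Toy application model (an assumption): ESD is the sum of per-resource contributions.
contribution = {"a": 0.4, "b": -0.1, "c": 0.5, "d": 0.3}
agent = ApplicationAgent(pool=["a", "b", "c", "d"])

actions = []
for _ in range(6):
    esd = sum(contribution[r] for r in agent.active)  # the application reports its ESD
    actions.append(agent.report(esd))                 # the AA learns and tunes

# Classify resources by learned utility, highest contribution first.
ranking = sorted(agent.utility, key=agent.utility.get, reverse=True)
print(actions)
print(agent.active, ranking)
```

In this toy run the agent needs no prior knowledge of the resources: it discovers from the ESD reports alone that resource `b` lowers the ESD, deletes it, and keeps adding resources until the target is reached. The learned `utility` values then give a simple ranking of resources by contribution.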
