A Lightweight Approach of Automatic Resource Configuration in Distributed Computing

A key problem in executing performance-critical applications in distributed computing environments (e.g., the Grid) is selecting the resources to run on. Much research on "automatic resource selection" has aimed to allocate best-effort resources on behalf of users to optimize execution performance. However, most current approaches are static (i.e., resource selection is performed prior to execution) and need detailed application-specific information. In this paper, we introduce a lightweight approach to automatic resource selection/configuration. The approach is based on simple control theory: the application continuously reports performance values to the middleware Application Agent (AA), which uses the reported values to decide how to reconfigure the execution environment dynamically during execution so that users’ performance requirements (e.g., an execution deadline, or running N iterations per second) are met. We divide the research into two paradigms: neglecting network latency and considering network latency. For the first paradigm, we use linear prediction with a Kalman filter to find the resource configuration (RC) expected to satisfy a given performance requirement. For the second, we let the AA probe possible RCs and roll back the bad ones, searching for a locally optimal RC that gives the application the highest performance.
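The two paradigms can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the prediction model, so the code assumes that reported performance grows linearly with the number of resources (perf = k * n) and that an RC is simply a resource count. The names `ScalarKalman`, `resources_needed`, `probe_and_rollback`, and `measure`, and all noise parameters, are hypothetical.

```python
import math


class ScalarKalman:
    """1-D Kalman filter tracking the per-resource rate k.

    Assumed model (not from the paper): perf = k * n + noise,
    where n is the number of resources in the current RC.
    """

    def __init__(self, k0=1.0, p0=1.0, q=1e-4, r=0.05):
        self.k = k0  # current estimate of the per-resource rate
        self.p = p0  # variance of that estimate
        self.q = q   # process noise variance (assumed value)
        self.r = r   # measurement noise variance (assumed value)

    def update(self, n, perf):
        # Predict: k is modelled as a random walk, so only the variance grows.
        self.p += self.q
        # Correct: the measurement matrix is H = n.
        gain = self.p * n / (n * n * self.p + self.r)
        self.k += gain * (perf - n * self.k)
        self.p *= 1.0 - gain * n
        return self.k


def resources_needed(k, target):
    """Smallest resource count whose predicted performance meets the target."""
    return max(1, math.ceil(target / k))


def probe_and_rollback(measure, start, max_n):
    """Grow the RC one resource at a time; keep the last RC that improved.

    measure(n) is a hypothetical callback returning the performance
    observed with n resources.
    """
    best_n, best_perf = start, measure(start)
    while best_n < max_n:
        candidate = best_n + 1
        perf = measure(candidate)
        if perf <= best_perf:
            break  # roll back: the candidate RC is discarded
        best_n, best_perf = candidate, perf
    return best_n, best_perf
```

In the first function pair, the AA would feed each reported value into `update` and call `resources_needed` with the user's target rate. `probe_and_rollback` stops at the first RC that fails to improve on its predecessor, so it finds a local optimum only, which matches the abstract's claim for the latency-aware paradigm.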
