Enabling Self-Managing Applications using Model-based Online Control Strategies

The increasing heterogeneity, dynamism and uncertainty of emerging distributed computing environments (DCEs) imply that an application must be able to detect and adapt to changes in its own state, its requirements and the state of the system in order to meet its desired QoS constraints. As system and application scales increase, ad hoc heuristic-based approaches to application adaptation and self-management quickly become insufficient. This paper builds on the Accord programming system for rule-based self-management and extends it with model-based control and optimization strategies. It also presents the development of a self-managing data streaming service based on online control using Accord. This service is part of a Grid-based fusion simulation workflow consisting of long-running simulations that execute on remote supercomputing sites and generate several terabytes of data, which must then be streamed over a wide-area network for live analysis and visualization. The self-managing data streaming service minimizes data streaming overheads on the simulations, adapts to dynamic network bandwidth and prevents data loss. An evaluation of the service that demonstrates its feasibility is presented.
