Approximation Modeling for the Online Performance Management of Distributed Computing Systems

A promising way to automate management tasks in computing systems is to formulate them as control or optimization problems expressed in terms of performance metrics. To be of practical value in a distributed setting, however, an online optimization scheme must overcome the curses of dimensionality and modeling. This paper develops a hierarchical control framework for solving performance management problems in distributed computing systems operating in a data center. Concepts from approximation theory are used to reduce the computational burden of controlling such large-scale systems: approximations are made both in constructing the dynamical models that predict system behavior and in solving the associated control equations. Using a dynamic resource-provisioning problem as a case study, we show that a computing system managed by the proposed control framework with approximation models realizes profit gains that are, in the best case, within 1% of those achieved by a controller using an explicit model of the system.
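The abstract does not spell out the models or the controller, so the following is only a minimal sketch of the general idea, not the paper's framework. It assumes a hypothetical explicit model (evenly loaded M/M/1 queues), a hypothetical profit function, and a coarse lookup table standing in for the approximation model. All names, rates, and costs are illustrative assumptions.

```python
import math

# Hypothetical explicit model (an assumption, not taken from the paper):
# each of `servers` machines is an M/M/1 queue receiving an equal share of load.
def explicit_response_time(servers, arrival_rate, service_rate=10.0):
    per_server = arrival_rate / servers
    if per_server >= service_rate:
        return math.inf  # overloaded: the queue grows without bound
    return 1.0 / (service_rate - per_server)

# Hypothetical profit: revenue is earned only if the response-time target is met,
# and every active server has a fixed operating cost.
def profit(servers, arrival_rate, response_time,
           target=0.5, revenue=1.0, server_cost=2.0):
    earned = revenue * arrival_rate if response_time <= target else 0.0
    return earned - server_cost * servers

# Offline step: sample the explicit model on a coarse grid of arrival rates.
def build_table(max_servers, rate_grid):
    return {(n, r): explicit_response_time(n, r)
            for n in range(1, max_servers + 1) for r in rate_grid}

# Online step: answer queries from the table instead of the explicit model,
# rounding the arrival rate up to the next grid point (a conservative choice).
def approx_response_time(table, rate_grid, servers, arrival_rate):
    for r in rate_grid:
        if r >= arrival_rate:
            return table[(servers, r)]
    return math.inf  # outside the sampled range

# One-step provisioning decision: pick the server count with the best predicted profit.
def choose_servers(predict, arrival_rate, max_servers):
    return max(range(1, max_servers + 1),
               key=lambda n: profit(n, arrival_rate, predict(n, arrival_rate)))

MAX_SERVERS = 8
RATE_GRID = [10 * i for i in range(1, 9)]  # 10, 20, ..., 80 requests per second
TABLE = build_table(MAX_SERVERS, RATE_GRID)

def approx_predict(servers, arrival_rate):
    return approx_response_time(TABLE, RATE_GRID, servers, arrival_rate)

if __name__ == "__main__":
    for rate in (37, 41):
        print(rate,
              choose_servers(explicit_response_time, rate, MAX_SERVERS),
              choose_servers(approx_predict, rate, MAX_SERVERS))
```

In this toy setting the table-based controller agrees with the explicit one at some arrival rates and over-provisions at others, which is the kind of trade-off between computational cost and profit that the abstract refers to.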

[31] Nagarajan Kandasamy, et al. Approximation Modeling for the Online Performance Management of Distributed Computing Systems, 2007, ICAC.
