QACO: exploiting partial execution in web servers

Web servers must deliver high-quality responses within a short response time, a requirement that is hard to meet, especially during load spikes. We observe that a response can be adapted, or partially executed, according to the resources currently available at the server. For example, under resource contention a web server can send a low- or medium-resolution image instead of the original high-resolution image. In this paper, we exploit partial execution to expose a trade-off between resource consumption and service quality, and we show how to manage server resources to improve both service quality and responsiveness. Specifically, we develop a framework called Quota-based Control Optimization (QACO), in which the quota represents the total amount of resources available for all pending requests. QACO consists of two modules: (1) a control module that adjusts the quota to meet the response time target, and (2) an optimization module that exploits partial execution and allocates the quota to pending requests so as to improve total response quality. We evaluate the framework with a system implementation in the Apache web server and with a simulation study of a Video-on-Demand server. The results show that, under a response time target, QACO achieves higher response quality than traditional techniques that admit or reject requests without exploiting partial execution.
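
The two-module structure described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the controller or the optimizer, so the proportional quota update, the greedy quality-per-cost allocation, and every name here (`Level`, `adjust_quota`, `allocate`, `gain`, `min_quota`) are assumptions made purely for illustration. Each request is assumed to offer a list of service levels ordered by increasing cost, such as low-, medium-, and high-resolution versions of an image.

```python
from dataclasses import dataclass


@dataclass
class Level:
    cost: float     # resource units to serve at this level (hypothetical unit)
    quality: float  # quality score of the response at this level (hypothetical)


def adjust_quota(quota, measured_rt, target_rt, gain=0.5, min_quota=1.0):
    """Control step (assumed form): shrink the quota when responses are
    slower than the target, grow it when they are faster."""
    error = (target_rt - measured_rt) / target_rt
    return max(min_quota, quota * (1.0 + gain * error))


def allocate(quota, requests):
    """Optimization step (assumed greedy heuristic): start every request at
    its cheapest level, then repeatedly apply the affordable upgrade with
    the best quality gain per unit of extra cost.

    `requests` is a list of per-request level lists, sorted by cost.
    Returns the chosen level index for each request.
    """
    choice = [0] * len(requests)
    remaining = quota - sum(levels[0].cost for levels in requests)
    if remaining < 0:
        raise ValueError("quota cannot cover the cheapest level of every request")
    while True:
        best = None  # (gain per cost, request index, extra cost)
        for i, levels in enumerate(requests):
            nxt = choice[i] + 1
            if nxt >= len(levels):
                continue  # already at the highest level
            extra = levels[nxt].cost - levels[choice[i]].cost
            gain = levels[nxt].quality - levels[choice[i]].quality
            if 0 < extra <= remaining:
                ratio = gain / extra
                if best is None or ratio > best[0]:
                    best = (ratio, i, extra)
        if best is None:
            return choice  # no further upgrade fits in the quota
        _, i, extra = best
        choice[i] += 1
        remaining -= extra


# Example: two image requests, each with three resolutions.
image = [Level(1, 1), Level(2, 3), Level(4, 4)]
quota = adjust_quota(10.0, measured_rt=0.4, target_rt=0.2)  # server is too slow
levels_chosen = allocate(quota, [image, image])
```

In this sketch, a measured response time of twice the target halves the quota from 10 to 5, and the allocator then serves both requests at medium resolution rather than serving one at full resolution and degrading the other to the lowest level. A greedy rule like this is only a heuristic for what is in general a knapsack-style problem.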
