Management of SOA-based, data-intensive applications deployed in a distributed cloud subject to response time percentile service level agreements

We consider geographically distributed datacenters forming a collectively managed cloud computing system. Multiple SOA-based applications are hosted in the cloud. Service Level Agreements (SLAs), which dictate Quality of Service (QoS) and pricing models, are negotiated between the cloud provider and the SOA-based enterprise application vendor. A QoS metric that has been explored in large distributed applications is the percentile of response times; this metric gives the customer a form of guarantee on the shape of the response time distribution. A typical percentile SLA requires the response time of a specified percentile of the input requests from particular classes of customers to be less than a specified value; if this value is exceeded, a penalty is charged to the cloud provider. In addition, the applications we consider are data-intensive, with strict temporal order constraints that must be enforced on requests within the same customer session. We propose Data-aware Session-grained Allocation with gi-FIFO Scheduling (DSAgS), a novel decentralized request management scheme deployed in each of the geographically distributed datacenters, to globally reduce the penalty charged to the cloud computing system. Our simulation and prototype-based evaluation shows that our dynamic scheme far outperforms commonly deployed management policies (typically employing static or random allocation with First In First Out, Weighted Round Robin, or dynamic priority-based scheduling). We further optimize our solution by proposing a "context-level" cache replacement algorithm.
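To make the percentile SLA concrete, the following minimal Python sketch checks one customer class against such an agreement. It is illustrative only and not part of DSAgS: the function names, the nearest-rank percentile definition, and the flat penalty amount are all assumptions, since the abstract does not specify how the percentile is computed or how the penalty is priced.

```python
import math


def percentile_response_time(response_times, percentile):
    """Return the nearest-rank percentile of the observed response times.

    Assumption: nearest-rank definition, i.e. the smallest observed value
    such that at least `percentile` percent of samples are at or below it.
    """
    if not response_times:
        raise ValueError("no response times recorded")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(response_times)
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[rank - 1]


def sla_penalty(response_times, percentile, threshold, penalty):
    """Return the penalty owed for one customer class (hypothetical flat fee).

    The SLA is treated as violated when the chosen percentile of response
    times exceeds `threshold`; otherwise no penalty is charged.
    """
    observed = percentile_response_time(response_times, percentile)
    return penalty if observed > threshold else 0.0


if __name__ == "__main__":
    # Response times in seconds for one class; one slow outlier.
    times = [0.4, 0.1, 2.0, 0.3, 0.2]
    print(sla_penalty(times, percentile=90, threshold=1.0, penalty=5.0))
    print(sla_penalty(times, percentile=50, threshold=1.0, penalty=5.0))
```

In this sketch the single slow request violates a 90th-percentile bound of 1.0 s but not a 50th-percentile one, which illustrates why a percentile SLA constrains the tail of the response time distribution rather than its average.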