Staffing and Control of Large-Scale Service Systems with Multiple Customer Classes and Fully Flexible Servers

We study large-scale service systems with multiple customer classes and many statistically identical servers. We address the following question: how many servers are required (staffing), and how should they be matched with customers (control), in order to minimize cost or maximize profit subject to quality of service (QoS) constraints? We tackle this question by characterizing scheduling and staffing schemes that are asymptotically optimal as the system load grows to infinity. The main asymptotic regime considered is the many-server heavy-traffic Quality and Efficiency Driven (QED) regime; the Efficiency Driven (ED) regime is also studied. In the QED regime, which was formally introduced by Halfin and Whitt, a delicate balance is struck between server efficiency and quality of service. This balance is enabled by the economies of scale associated with the system size. Our main findings are: (a) Decoupling of staffing and control, namely (i) staffing disregards the multi-class nature of the system and is analogous to the staffing of a single-class system with the same aggregate demand and with the cost and QoS parameters of the lowest-priority class, and (ii) class-level service differentiation is obtained by using a simple threshold-priority (TP) control with state-independent thresholds. (b) Robustness of the staffing and control rules: in the QED regime, our proposed Square-Root Safety (SRS) staffing rule and TP control are asymptotically optimal with respect to various problem formulations and model assumptions. (c) The QED and ED regimes are obtained as solutions of the joint staffing and control problem rather than as assumptions.
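
As an illustration of the decoupling in finding (a), the following minimal Python sketch staffs a multi-class system as if it were a single-class M/M/N queue with the same aggregate demand. It assumes the usual form of square-root safety staffing, N = R + beta * sqrt(R), where R is the aggregate offered load, and it checks the resulting delay probability with the Erlang-C formula. The function names, the common service rate mu, and the safety parameter beta are assumptions made for this example; the abstract does not specify how beta is chosen, and the sketch does not model the TP control.

```python
import math


def srs_staffing(arrival_rates, mu, beta):
    """Square-root safety staffing on the aggregate offered load.

    arrival_rates: per-class arrival rates (hypothetical inputs)
    mu: common service rate of the statistically identical servers
    beta: safety parameter (assumed given; not derived here)
    """
    offered_load = sum(arrival_rates) / mu  # R = total demand / service rate
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


def erlang_c(n, offered_load):
    """Delay probability of a single-class M/M/n queue (Erlang-C)."""
    if n <= offered_load:
        return 1.0  # unstable without abandonment: every arrival waits
    # Numerically stable Erlang-B recursion, then convert to Erlang-C.
    b = 1.0
    for k in range(1, n + 1):
        b = offered_load * b / (k + offered_load * b)
    return n * b / (n - offered_load * (1.0 - b))


if __name__ == "__main__":
    rates = [50.0, 30.0, 20.0]  # three customer classes, illustrative only
    n = srs_staffing(rates, mu=1.0, beta=1.0)
    p_wait = erlang_c(n, sum(rates) / 1.0)
    print(f"servers: {n}, aggregate delay probability: {p_wait:.3f}")
```

In this sketch only the total arrival rate enters the staffing level; class-level differentiation would be left entirely to the control rule.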